NDI-python: 2026-06 consolidation — audit remediation + app frameworks + schema-v2 dual-accessor#60
Open
audriB wants to merge 66 commits into
The mechanical MATLAB-mirror class rename (87fea89) was executed as a blind
string replacement of the 73 old class names. Because many old names were
common English words (Data, Lab, Database, Document, Element, Session, Subject,
Epoch, Dataset, Icon, Query, Cache...), the replace also rewrote prose, URLs, UI
strings, MATLAB references, import paths, and external-library names that merely
contained those words. This commit reverts the collateral damage while keeping
every intended class rename.
Functional fixes:
- ndi.gui.component was unimportable: module files were never renamed, but
import statements had their module-path segments rewritten (e.g.
ndi.gui.component.abstract.ndi_gui_component_abstract_ProgressMonitor -> the
real ndi.gui.component.abstract.ProgressMonitor). Same for the ndi.gui
lazy-import table (docViewer entry).
- ndi.cloud.admin.crossref raised ImportError:
"from xml.etree.ElementTree import ndi_element" -> Element.
- ProgressMonitor's public API mirrored MATLAB's ProgressTracker property; the
kwarg/attribute had been renamed to the flattened class token. Restored to
ProgressTracker (type annotations keep the class).
- ontology/providers.py EMPTY-ontology fetch URL pointed at the nonexistent
Waltham-ndi_gui_Data-Science org (live 404).
tests/matlab_tests/test_dabrowska.py::TestOntologyIntegration::test_empty_ontology_lookup
fails on main and passes with this change.
- version() and app URLs, docs clone URLs: Waltham-Data-Science restored.
Contract/metadata fixes:
- gui + gui/component + common bridge YAMLs: matlab_path/python_path/
type_matlab/inherits and entry name: fields pointed at phantom files
(+ndi/+gui/ndi_gui_Data.m etc.); restored to the real MATLAB names verified
against NDI-matlab @ 2d76370. Entry name: fields follow the root bridge
convention (MATLAB short names).
- file/navigator epochprobemap_class default: the corrupted value
"ndi.epoch.ndi_epoch_epochprobemap" was restored to MATLAB's actual default
"ndi.epoch.epochprobemap_daqsystem" (navigator.m:61,92) rather than the
pre-corruption "ndi.epoch.EpochProbeMap", which exists in neither language. The
field is serialized into filenavigator documents that MATLAB evals, so only the
MATLAB class name is correct.
- UI strings matched back to the MATLAB GUI sources: window title
"Neuroscience Data Interface", tab "Database View", QLabel "Database".
Prose/docs/tests: ~380 sites restored across docstrings, comments, error
messages ("Document {id} not found", "Element {i} is empty"), plot titles,
assert/skip messages, AGENTS.md, .cursorrules, docs/*.md ("Neuroscience Data
Interface", "VH-Lab", "Nielsen Lab", "Lab mouse", "Document any intentional
divergences"), the did.Query external reference, requests.Session, and the
glued TestEpochId2Element class name. References that legitimately name renamed
Python classes were left on the new names.
Adds tests/test_gui_imports.py: gui package importability, lazy-table
resolution, ProgressTracker kwarg parity, crossref import, version URL, bridge
python_path existence, and a repo-wide guard that the spurious token cannot
reappear outside src/ndi/gui.
Validation: black --check and ruff check clean. Full suite
(pytest tests/ -q --ignore=tests/symmetry) failure set is identical to
origin/main in this environment (62 pre-existing environmental failures:
live-network, local-dataset, and dependency-version issues), plus the dabrowska
EMPTY-ontology test now passes and the 9 new tests pass.
The ndi.cloud.sync engine moved documents incorrectly or unsafely. This
rebuilds the five sync operations, the sync index, and the binary-upload
path against the MATLAB source of truth.
C1 — local state from the dataset, full documents uploaded.
The sync ops read "local state" from the sync index (so a freshly added
local document was invisible) and uploaded stub bodies {"ndiId": id}
instead of the document. They now enumerate local documents from the
dataset database (internal.listLocalDocuments) and upload full document
bodies via upload.uploadDocumentCollection.
listLocalDocuments was reading dataset.session.database_search, but an
ndi.dataset has no public .session (only _session) — the AttributeError
was swallowed and every op saw zero local documents. It now uses
dataset.database_search, matching MATLAB listLocalDocuments.m, which also
traverses linked sessions.
C2 — sync index camelCase compatibility.
Python wrote snake_case keys to .ndi/sync/index.json while MATLAB writes
camelCase to the same file, so alternating clients saw an empty index and
re-transferred everything. SyncIndex.read now accepts both dialects;
SyncIndex.write emits MATLAB's camelCase (localDocumentIdsLastSync,
remoteDocumentIdsLastSync, lastSyncTimestamp).
C3 — binary file upload.
uploadFilesForDatasetDocuments read a nonexistent top-level file_uid and
uploaded nothing without error. It now resolves binary UIDs and their
local paths from files.file_info[].locations[] and the dataset binary
store, filters already-uploaded files, and uploads each. uploadSingleFile's
bulk branch passed the {url, jobId} dict where putFiles expects a URL
string (a guaranteed pydantic failure); it now passes url + jobId.
uploadToNDICloud mis-unpacked zipForUpload's (Path, list) return as
(success, msg) and is rebuilt on uploadDocumentCollection. scanForUpload
and the uploadDocumentCollection dedup/manifest now key off base.id rather
than a top-level ndiId.
C4 — downloads enter the database.
Downloaded documents were written as raw JSON into .ndi/documents/ where
database_search could not find them, and deleteLocalDocuments removed that
JSON cache rather than database documents. Downloads now convert via
jsons2documents and dataset.database_add; deleteLocalDocuments uses
dataset.database_rm. A re-synced document already in the database (which
the real ndi_database raises as ValueError) is counted as present, not a
failure.
Decision (locked): twoWaySync is strictly additive — MATLAB's twoWaySync.m
never deletes, so a remote (or local) deletion is not propagated. Only the
mirror modes delete. The previous code propagated deletions, which combined
with the index-only local view could silently delete local data.
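The additive rule can be sketched as plain set arithmetic. The function name and the shape of the returned plan are hypothetical; only the behavior (upload what is missing remotely, download what is missing locally, delete nothing) follows the decision stated above.

```python
def plan_two_way_sync(local_ids: set, remote_ids: set) -> dict:
    """Build an additive two-way plan: transfers only, never deletions."""
    return {
        # Present locally but not remotely: push to the cloud.
        "upload": local_ids - remote_ids,
        # Present remotely but not locally: pull into the dataset.
        "download": remote_ids - local_ids,
        # Additive by design: a document missing on one side is copied
        # over, never interpreted as a deletion to propagate.
        "delete_local": set(),
        "delete_remote": set(),
    }
```

Because the plan is computed from the two current id sets only, a document removed on one side simply reappears there on the next two-way sync.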
Failed uploads no longer advance the sync index (mirrors MATLAB's issue-805
guard): if any document or binary upload fails, the index is left unchanged
so the failed documents are retried on the next sync rather than being
treated as already-synced.
The signature of the sync ops changed from a dataset path string to a
dataset object (matching MATLAB's ndiDataset argument); deleteLocalDocuments
and uploadFilesForDatasetDocuments changed accordingly, and the
orchestration.py call site and sync/__init__ re-exports were updated.
Tests: test_cloud_sync_operations.py (full-document upload, database_add
ingest, additive twoWaySync, failed-upload index guard, idempotent re-ingest,
mirror-mode deletion) and test_cloud_upload_files.py (file_info[].locations
manifest, bulk-branch url/jobId, uploadToNDICloud) run against an in-memory
fake dataset with the cloud seams mocked (no live AWS). test_cloud_sync.py
updated to assert the camelCase index and dual-dialect read. black/ruff
clean; full suite failure set unchanged from origin/main (environmental
only), +13 new cloud tests passing.
Deferred (see MISSION_HANDOFF.md): wiring the public orchestration.syncDataset
through this engine and implementing its mirror modes (audit 3.4-16, PR3);
download-side binary reconciliation; single-archive batch file upload;
validate() content-hash mismatch detection; refreshing the cloud bridge YAML
signatures (PR10).
Several cloud-client routes did not match the backend and 404'd; the scope
validator rejected valid dataset-scoped queries; and there was no retry for
the gateway timeouts the API is prone to. Fixes verified against the
ndi-cloud-node Express routes (the authoritative source) and MATLAB.
- compute.abortSession: POST /compute/{id}/abort -> DELETE /compute/{sessionId}.
The backend aborts via DELETE (the `quit` controller); there is no /abort
route. Matches MATLAB AbortSession (also DELETE).
- compute.finalizeSession: POST /compute/{id}/finalize -> POST
/compute/{sessionId}/advance. The backend route is /advance (the `advance`
controller); there is no /finalize route — the swagger that advertised one
was wrong. (MATLAB FinalizeSession.m still POSTs /finalize and carries the
same latent 404; flagged for the NDI-matlab phase.)
- files.getBulkUploadURL: POST -> GET /datasets/{org}/{ds}/files/bulk (the
`getBulkUploadFileUrl` GET controller). It now delegates to
getFileCollectionUploadURL, which already issued the correct GET, so the two
share one source of truth (the MATLAB getBulkUploadURL.m is an empty stub).
- _validators.Scope: was Literal['public','private','all'], which rejected the
comma-separated 24-hex dataset-id scopes MATLAB's ndiquery accepts. Now a
visibility keyword OR a normalized comma-separated list of 24-hex ObjectIds,
mirroring MATLAB's iMustBeValidScope.
- client._request: retry 502/503/504 on idempotent methods (GET/HEAD/OPTIONS/
PUT/DELETE, never POST), up to MAX_RETRIES with linear backoff. The 504 is
the API Gateway 29 s cap; a brief retry of a safe request often succeeds.
- internal.formatApiError: port of +cloud/+internal/formatApiError.m (HTTP
status line + server message/error body, with the same fallbacks).
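The scope rule in the list above (a visibility keyword, or a comma-separated list of 24-hex ObjectIds) can be sketched as follows. The function name, the whitespace trimming, and the lowercasing are assumptions about what "normalized" means here, not the actual _validators code.

```python
import re

_VISIBILITY = {"public", "private", "all"}
_OBJECT_ID = re.compile(r"[0-9a-fA-F]{24}")


def normalize_scope(scope: str) -> str:
    """Accept a visibility keyword or a CSV of 24-hex dataset ids."""
    value = scope.strip()
    if value.lower() in _VISIBILITY:
        return value.lower()
    ids = [part.strip() for part in value.split(",")]
    # Every entry must be exactly 24 hex characters; an empty entry
    # (for example from a trailing comma) fails fullmatch and is rejected.
    if all(_OBJECT_ID.fullmatch(part) for part in ids):
        return ",".join(part.lower() for part in ids)
    raise ValueError(f"invalid scope: {scope!r}")
```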
Tests: test_cloud_api_contract.py asserts the actual HTTP method+path at the
requests layer for each route, the scope keyword/CSV/reject cases, the retry
behavior (retries idempotent 5xx, not POST, stops at MAX_RETRIES), and
formatApiError. test_cloud_compute.py updated to assert the DELETE abort and
the /advance finalize route. black/ruff clean; full-suite failure set
unchanged from origin/main (environmental only).
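The retry policy can be sketched without an HTTP library by passing in a callable that performs one attempt and returns the status code. The constant values, the function name, and the reading of MAX_RETRIES as "retries after the first attempt" are assumptions; the status set, the method set, and the linear backoff come from the description above.

```python
import time
from typing import Callable

MAX_RETRIES = 3        # assumed value; counts retries after the first try
BACKOFF_SECONDS = 1.0  # assumed base delay for the linear backoff
RETRY_STATUSES = {502, 503, 504}
IDEMPOTENT_METHODS = {"GET", "HEAD", "OPTIONS", "PUT", "DELETE"}


def request_with_retry(
    send: Callable[[], int],
    method: str,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() once, then retry gateway errors on idempotent methods."""
    allowed = MAX_RETRIES if method.upper() in IDEMPOTENT_METHODS else 0
    status = send()
    for attempt in range(1, allowed + 1):
        if status not in RETRY_STATUSES:
            break
        # Linear backoff: wait 1x, 2x, 3x ... the base delay.
        sleep(BACKOFF_SECONDS * attempt)
        status = send()
    return status
```

POST is excluded because repeating a non-idempotent request after a gateway timeout could apply the same change twice.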
Deferred (see MISSION_HANDOFF.md, cloud-coupled / needs a live backend): the
ZIP-bulk document upload byte transfer and documents.bulkUpload sending file
bytes; the helloMatlab compute pipeline; listFiles stabilization and
filesNotYetUploaded re-queue; cross-endpoint pagination standardization.
ndi_neuron inherited ndi_element_class() == "ndi.element" from ndi_element
and was absent from the class registry. MATLAB writes class(obj) == "ndi.neuron"
into element.ndi_element_class (element.m:524), so:
- a MATLAB-written neuron could not be reconstructed — get_class("ndi.neuron")
returned None and _document_to_object raised "Unknown element class", which
getelements() swallowed with `except: pass`, so getelements() returned ZERO
neurons with no error; and
- a Python-written neuron was stored mislabelled as "ndi.element" and
round-tripped as a plain element.
Fixes:
- neuron.ndi_element_class() returns "ndi.neuron".
- element_timeseries.ndi_element_class() returns "ndi.element.timeseries"
(it had the same inherited-mislabel gap; the audit flagged it to check).
- class_registry registers ndi_neuron and ndi_element_timeseries so both
resolve.
- session_base.getelements now surfaces a construction failure (logs the
offending document id and re-raises) instead of silently dropping the
element — a document that isa('element') but won't construct is a
registry/parity gap that must be loud.
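The failure mode and the fix can be illustrated with a toy registry. Every class, function, and document layout below is a hypothetical stand-in (the real classes are ndi_element, ndi_neuron, and session_base.getelements); the sketch only shows why the class string must be overridden and registered, and why construction errors should be logged and re-raised.

```python
import logging

logger = logging.getLogger(__name__)


class Element:
    def __init__(self, name: str):
        self.name = name

    @classmethod
    def ndi_element_class(cls) -> str:
        return "ndi.element"


class Neuron(Element):
    # Without this override the subclass would inherit "ndi.element"
    # and be stored and reloaded as a plain element.
    @classmethod
    def ndi_element_class(cls) -> str:
        return "ndi.neuron"


# Toy registry keyed by the class string written into documents.
CLASS_REGISTRY = {c.ndi_element_class(): c for c in (Element, Neuron)}


def document_to_object(doc: dict) -> Element:
    class_name = doc["element"]["ndi_element_class"]
    cls = CLASS_REGISTRY.get(class_name)
    if cls is None:
        raise ValueError(f"Unknown element class: {class_name}")
    return cls(doc["element"]["name"])


def get_elements(docs: list) -> list:
    elements = []
    for doc in docs:
        try:
            elements.append(document_to_object(doc))
        except Exception:
            # Log the offending document id, then re-raise instead of
            # silently dropping the element.
            logger.error("cannot construct element from %s", doc["base"]["id"])
            raise
    return elements
```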
Test test_neuron_registry_c8b.py covers the class strings, registry
resolution, the full neuron round-trip through getelements (was zero before),
and that getelements raises on an unregistered element class. black/ruff
clean; element/session/neuron suites pass (the one failure, test_database_clear,
is a pre-existing did-library version mismatch, on origin/main too).
This is the prerequisite for the neuron read path (PR4 C8) and batched neuron
creation (PR12 Kilosort).
Brings ndi.time.syncgraph.time_convert and element_timeseries read/write to
MATLAB parity for three of the four time-system critical findings.
C5 -- syncgraph.time_convert gained the four missing branches:
- empty-epoch global resolution (_resolve_in_epochid): a timeref with an empty
epoch and a global clock is resolved by scanning the epochtable for the epoch
whose t0_t1 contains time+t_in (MATLAB syncgraph.m:653-675);
- same-referent shortcut with cross-clock rescale (_same_referent +
_same_referent_convert + _rescale), bypassing the graph (:677-700);
- destination time-window filtering (:744-746);
- equal-cost tie-breaking, sorting candidates by epoch_id then breaking on t_in
(+Inf->last, else first) so the result is deterministic (:772-789).
C7 -- _apply_rules_to_edge now threads the daqsystem through to rule.apply.
Trigger-based syncrules (commonTriggersOverlappingEpochs, randomPulses) need the
daqsystem to read their trigger trains; without it they early-returned
(None, None) and could never produce an edge. The two rule classes were updated
to accept and use it.
C8 -- element_timeseries read/write through VHSB:
- new ndi.util.vhsb (vhsb_read/vhsb_write) stores the X (time) axis alongside
the Y (data) so the time axis is preserved (the old _store wrote raw
datapoints.tobytes() with no header, dropping time);
- _store_timeseries_data writes the MATLAB filename epoch_binary_data.vhsb (was
timeseries.vhsb), and addepoch attaches it before database_add;
- readtimeseries reads the VHSB window [t0,t1] and returns a real
ndi_time_timereference (referent=self) instead of timeref=None, matching MATLAB.
Tests: test_vhsb_c8.py (VHSB round-trip with time axis, filename, windowing)
and test_time_convert_c5c7.py (the four branches + daqsystem threading).
black/ruff clean; targeted time/element suites pass (202), full fast suite
failure set unchanged from origin/main (9 environmental did-mismatch failures).
DEFERRED on this branch (audit C6): addunderlyingepochs -- injecting
element/probe epoch nodes (and the cost-77 utc/exp_global equivalence edges)
into the graph, plus the missing-node retry in time_convert.
_add_underlying_epochs is still a stub and time_convert carries an in-code NOTE;
until it lands, element/probe-to-DAQ time conversion that isn't a same-referent
or direct-DAQ case won't resolve. This is the hardest piece (graph mutation +
retry) and is tracked in MISSION_HANDOFF.md to finish on this branch.
… C5 branch 3)
Completes the time-system port. Before this, the syncgraph held only DAQ-system
epoch nodes, so time_convert for an element/probe whose epoch was not a
directly-resolvable DAQ epoch returned "Could not find source node".
epoch/epochset.py: add epochnodes() and underlyingepochnodes() to the base
ndi_epoch_epochset (inherited by element/probe), porting
ndi.epoch.epochset.epochnodes (epochset.m:321-374) and underlyingepochnodes
(epochset.m:393-505): one node per (epoch, clock), the utc->dev_local companion
node, and the recursive walk down to sync-graph roots with its cost/mapping
sub-graph.
time/syncgraph.py: implement _add_underlying_epochs (port of
syncgraph.addunderlyingepochs, syncgraph.m:461-550) and wire the lazy
missing-node retry into time_convert for both the source (syncgraph.m:716-735)
and destination (syncgraph.m:751-762) referents. The sub-graph is merged onto
the main graph as a direct node-index overlay that reuses any underlying node
already present (e.g. the shared DAQ epoch) and leaves existing edges
untouched -- identical in result to vlt.graph.mergegraph, without depending on
that helper's incompatible linear-index return.
Also adds C5 branch 3: destination candidates are narrowed by the t0_t1 time
window when both clocks are global (syncgraph.m:744-746).
The cost-77 utc/exp_global equivalence edges (syncgraph.m:526-543) are ported
to their documented intent -- connect every node pair sharing an equivalence
global clock with a fallback identity edge, filling only pairs that have no
cheaper edge. MATLAB's strcmp guard at syncgraph.m:534 indexes the matches array
with the outer clock-loop counter rather than the node-pair counters, a latent
bug that makes the guard depend on node ordering; the intent-port avoids it.
Tests: new tests/test_addunderlyingepochs_c6.py (14) drives the real
epochnodes/underlyingepochnodes/merge/retry path through real ndi_element
objects (element->device, 3-level element->element->device, device-leaf reuse
with self-edge preservation, dest-retry device->element), plus the utc-companion
affine offsets, end-to-end non-identity composition, multi-clock fan-out,
equivalence edges, and the C5-branch-3 window filter. black + ruff clean; fast
suite unchanged at the 9 pre-existing environmental failures.
Deferred to PR10 (bridge overhaul, consistent with the C5/C7/C8 commit): epoch
and time namespace bridge YAML entries for the new methods.
Syncgraph parity across C5-C7 still needs the cross-language symmetry suite
against a paired MATLAB checkout (unavailable here) before production trust.
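One reading of the equivalence-edge intent described above, as a hedged sketch on a plain cost matrix. The clock-type strings, the function name, the use of infinity for "no edge", and the exact fill condition (replace only when the existing cost is higher than the fallback) are all assumptions; the real port also maintains a mapping matrix, which is omitted here.

```python
import math

EQUIVALENCE_COST = 77.0
# Clock-type names are assumed for illustration.
EQUIVALENCE_CLOCKS = ("utc", "exp_global_time")


def add_equivalence_edges(cost: list, clocks: list) -> list:
    """Return a copy of cost with fallback edges between same-clock nodes.

    cost[i][j] is the edge cost from node i to node j (math.inf = no edge);
    clocks[i] is the clock type of node i.
    """
    filled = [row[:] for row in cost]
    for clock in EQUIVALENCE_CLOCKS:
        members = [i for i, c in enumerate(clocks) if c == clock]
        # Index by the node pair (i, j) itself, so the outcome does not
        # depend on the order in which nodes appear.
        for i in members:
            for j in members:
                if i != j and filled[i][j] > EQUIVALENCE_COST:
                    filled[i][j] = EQUIVALENCE_COST
    return filled
```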
…lator pairOnOff
Continues PR8 (after the C8b neuron-registry fix) with three parity items whose
MATLAB source is in the audit baseline.
session: rename ndi.session.is_fully_ingested -> isIngested to match MATLAB
3cde88c8, keeping is_fully_ingested as a back-compat alias. Update the one
internal caller (dataset.add_ingested_session) to the new name.
dataset: add isIngested() (true when every session in the dataset is ingested;
an empty dataset is ingested) and convertLinkedSessionToIngested(), which copies
a linked session's documents and binary files into the dataset and replaces its
session_in_a_dataset record with an is_linked=0 one. The document/binary copy
loop is factored out of add_ingested_session into a shared
_copy_session_documents helper (MATLAB ndi.dataset.copySessionToDataset).
probe/timeseries_stimulator: add the pairOnOff(times, signs) static method
(MATLAB f1e2ff8c, issue #248) that walks events in time order, pairing each
stim-on with the next stim-off and NaN-filling orphans, and use it at the first
marker channel instead of separating on/off into mismatched-length arrays. This
keeps stimon/stimoff aligned when a read window clips an on without its off (or
vice versa).
Tests: new tests/test_pr8_session_dataset_stimulator.py (13) covering pairOnOff
(balanced, orphan-on, orphan-off, unsorted, empty), session isIngested + alias,
dataset.isIngested, and convertLinkedSessionToIngested (convert, confirmation
guard, unknown-session and already-ingested errors). black + ruff clean; fast
suite unchanged at the 9 pre-existing environmental failures.
Remaining PR8 item: element.timeseries.addMultiple (MATLAB cbbb099b) writes
VHSB via ndi.util.vhsb, which exists only on the time-system branch (PR4); it is
implemented separately on a branch based on that work and is sequenced after
PR4.
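A minimal sketch of the pairing walk described for pairOnOff. The sign convention (positive means stim-on, anything else stim-off), the list-based return value, and the exact handling of two consecutive ons are assumptions; the MATLAB source is the authority on those details.

```python
import math


def pair_on_off(times, signs):
    """Pair each stim-on with the next stim-off; NaN-fill orphans."""
    # Walk the events in time order regardless of input order.
    events = sorted(zip(times, signs), key=lambda event: event[0])
    on_times, off_times = [], []
    pending_on = None
    for t, sign in events:
        if sign > 0:
            if pending_on is not None:
                # A second on arrived before any off: the first is an orphan.
                on_times.append(pending_on)
                off_times.append(math.nan)
            pending_on = t
        elif pending_on is not None:
            on_times.append(pending_on)
            off_times.append(t)
            pending_on = None
        else:
            # An off with no preceding on (clipped by the read window).
            on_times.append(math.nan)
            off_times.append(t)
    if pending_on is not None:
        on_times.append(pending_on)
        off_times.append(math.nan)
    return on_times, off_times
```

Both returned lists always have the same length, which is the property that keeps stimon and stimoff aligned when a window clips one side of a pair.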
Port of MATLAB ndi.element.timeseries.addMultiple (cbbb099b), the recommended
bulk path for creating many timeseries elements (default ndi.neuron) at once --
e.g. the ~hundreds of units from a Kilosort sort. Each spec describes one
element with one or more epochs; addMultiple builds all element, element_epoch
(with their epoch_binary_data.vhsb data) and caller-supplied extra documents in
memory and commits them in chunked database_add calls -- elements first, then
their dependents -- avoiding the per-epoch database search that addepoch
performs for every epoch of every element. The element objects are constructed
only when build_objects is set, so the importer's statement-form call pays no
construction cost.
This branch is based on fix/element-neuron-registry (PR8: the C8b neuron
registration that makes ndi.neuron constructable, plus session/dataset/
stimulator parity) and merges fix/time-system (PR4) because addMultiple writes
VHSB directly via ndi.util.vhsb -- both must merge before this. PR12 (Kilosort
import) builds on this method.
Tests: new tests/test_addmultiple.py (8) against a real session -- neuron
object creation, element/epoch document + dependency structure, the
build_objects toggle, extra documents stamped with element_id, chunking, and a
full VHSB write -> readtimeseries round-trip of the spike times. black + ruff
clean; fast suite unchanged at the 9 pre-existing environmental failures.
…es, VHAudreyBPod transform
C10: ndi.epoch.epochprobemap_daqsystem serialize/decode now match MATLAB —
serialize emits a header row of field names plus one data row per object (was a
single header-less line that crashed MATLAB's decode on int('reference') and
could not represent an array). decode/loadfromfile skip the header and support
arrays; added serialize_array/save_array_to_file/decode_array.
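The header-row layout can be sketched as below. The field list, the tab delimiter, and the function bodies are assumptions for illustration; only the shape (one header row of field names, then one data row per object, with decode skipping the header) is taken from the description above.

```python
from dataclasses import dataclass, fields


@dataclass
class ProbeMapEntry:
    # Field names are assumed; consult the real class for the actual set.
    name: str
    reference: int
    type: str
    devicestring: str
    subjectstring: str


_FIELD_NAMES = [f.name for f in fields(ProbeMapEntry)]


def serialize_array(entries) -> str:
    """Emit a header row of field names plus one row per entry."""
    lines = ["\t".join(_FIELD_NAMES)]
    for entry in entries:
        lines.append("\t".join(str(getattr(entry, n)) for n in _FIELD_NAMES))
    return "\n".join(lines)


def decode_array(text: str) -> list:
    """Parse the table back, using the header row to name the columns."""
    rows = [line.split("\t") for line in text.splitlines() if line.strip()]
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    entries = []
    for row in data:
        record = dict(zip(header, row))
        # Only the data rows are converted; the header is never parsed as int.
        record["reference"] = int(record["reference"])
        entries.append(ProbeMapEntry(**record))
    return entries
```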
§3.4-4: ndi.daq.system.mfdaq recognizes the analog-event/analog-mark channel
types (aep/aen/aimp/aimn, + digital de/den/dim/dimn in the prefix map) and
strips the _t<threshold> suffix before matching, re-attaching it only to
analog-event prefixes (MATLAB mfdaq_prefix/mfdaq_type, 2157c70f). Added
strip_threshold_suffix and the analog-event entries to standardize_channel_type.
§3.4-5: ndi.daq.metadatareader.VHAudreyBPod.read_audrey_bpod_json ports the
7-stimulus transform (6 solenoids + 1 wash/water), replacing the raw-JSON
passthrough; _read_summary_json applies it when the JSON is a BPod config.
Tests: new tests/test_pr5_daq_epoch.py (12) + an epochprobemap array round-trip
in test_batch_a.py. black + ruff clean.
…4-6); numeric _t threshold
§3.4-6: ndi.epoch.epochset.epochgraph now returns the (cost, mapping) matrices
MATLAB's ndi.time.syncgraph expects (was a list of node dicts), via a new
buildepochgraph that builds an MxM cost matrix and time-mapping matrix over the
(epoch, clock) nodes -- same-epoch cross-clock pairs rescale t0_t1, others use
the clock types' epochgraph_edge. matchedepochtable is realigned to the MATLAB
boolean-hash semantics (true iff the cached epoch-table hash equals the
argument); its prior entry-lookup signature had no callers. Node enumeration is
inlined so this is independent of the time-system branch's epochnodes().
Also tighten the analog-event _t<threshold> strip (§3.4-4) to a trailing
numeric suffix only: 'aep_t2.5' -> 'aep' but 'custom_type' is left unchanged
(the earlier naive '_t' search wrongly truncated ordinary names). Added
threshold_suffix().
Tests: epochgraph matrix shape/rescale/no-link, matchedepochtable hash, and the
numeric-only strip guard. black + ruff clean; fast suite back at the 9 baseline
failures.
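The numeric-only strip can be sketched with an anchored regular expression. The pattern, the optional minus sign, and the return value of threshold_suffix are assumptions; the two example inputs and their expected results come from the text above.

```python
import re

# Matches only a trailing "_t<number>", e.g. "_t2.5" or "_t3".
# Allowing a leading minus sign is an assumption.
_THRESHOLD_SUFFIX = re.compile(r"_t-?\d+(?:\.\d+)?$")


def strip_threshold_suffix(channel_type: str) -> str:
    """Remove a trailing numeric _t<threshold> suffix, if present."""
    return _THRESHOLD_SUFFIX.sub("", channel_type)


def threshold_suffix(channel_type: str) -> str:
    """Return the trailing _t<threshold> suffix, or '' when there is none."""
    match = _THRESHOLD_SUFFIX.search(channel_type)
    return match.group(0) if match else ""
```

Anchoring the pattern at the end of the string and requiring digits after `_t` is what keeps names such as 'custom_type' intact.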
… (§3.4-7)
Port MATLAB ndi.time.fun.syncTriggerTrains and syncRandomTriggers into a new
src/ndi/time/fun.py: both align two independent clocks recording a common pulse
train using quantized inter-pulse-interval fingerprints, then validate and fit a
linear model. sync_trigger_trains is drift-aware (dynamic per-pulse tolerance),
tolerates one dropped pulse, requires an 80% match rate, and raises
ndi:time:sync:ambiguous when distinct alignments compete. sync_random_triggers
hashes the longer-duration train and verifies a candidate with a further pulse.
Wire the consumers (audit §3.4-7):
- common_triggers_overlapping_epochs._sync_triggers now falls back to
sync_trigger_trains when the two trains have unequal counts (was a hard
ValueError), keeping the direct least-squares fit for the equal-count 1:1 case.
- random_pulses._sync_random_triggers replaces its inter-pulse-interval
cross-correlation (which diverged on partial-overlap / drifting data) with the
fingerprint matcher.
Quantization uses MATLAB-compatible round-half-away-from-zero so fingerprint
buckets match MATLAB exactly at k+0.5 boundaries (numpy rounds half-to-even).
Probe order is deterministic rather than MATLAB's randperm (a speed-only
optimization; identical validated result for unambiguous data).
Tests: new tests/test_pr5_sync_algorithms.py (13) -- exact/drift/dropped-pulse/
partial-overlap recovery, below-rate and unrelated -> NaN, periodic ->
ambiguous, random-trigger recovery + partial overlap, and the common-triggers
fallback vs direct-fit paths. Adversarial parity review (clean on index/
inversion/polyfit/ambiguity; the np.round mismatch it flagged is fixed). black +
ruff clean; fast suite at the 9 baseline failures.
NOTE: PR4 (fix/time-system) also edited both syncrule files (apply() daqsystem
param) -- resolve at merge; regions differ.
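The rounding difference is easy to reproduce with the standard library alone, because Python's built-in round(), like np.round, rounds halves to the nearest even integer. The helper names and the bin-width parameter below are hypothetical; the sketch only demonstrates round-half-away-from-zero, which is how MATLAB's round behaves.

```python
import math


def round_half_away(x: float) -> int:
    """Round to nearest integer, with exact halves going away from zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def quantize(interval: float, bin_width: float) -> int:
    """Bucket an inter-pulse interval (bin_width is a hypothetical parameter)."""
    return round_half_away(interval / bin_width)
```

For example, round(2.5) gives 2 while round_half_away(2.5) gives 3, so an interval landing exactly on a k+0.5 boundary would fall into a different fingerprint bucket under the two rules.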
… defs, definition cache
§3.4-1: document.dependency_value_n now falls back to the un-numbered dependency
name when there is no <name>_1, skipping an empty placeholder value — matching
MATLAB document.m:365-388 (incl. the 2026-03-25 empty-placeholder skip). Numbered
dependencies still take priority.
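A hedged sketch of the lookup order described in §3.4-1. The function body, the list-of-dicts layout for depends_on, and the list return type are assumptions; the priority of numbered names, the un-numbered fallback, and the empty-placeholder skip follow the text above.

```python
def dependency_value_n(depends_on: list, name: str) -> list:
    """Collect name_1, name_2, ...; fall back to the bare name if none exist."""
    by_name = {dep["name"]: dep["value"] for dep in depends_on}
    values = []
    index = 1
    # Numbered dependencies take priority.
    while f"{name}_{index}" in by_name:
        values.append(by_name[f"{name}_{index}"])
        index += 1
    # Fallback: the un-numbered name, unless it is an empty placeholder.
    if not values and by_name.get(name):
        values.append(by_name[name])
    return values
```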
§3.4-2: document.__add__ raises ValueError on a duplicate file name instead of
silently de-duplicating via set() (which also lost order) — matching MATLAB's
"Documents have files of the same name. Cannot be combined." Disjoint file lists
now merge in order.
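The difference between the old set()-based merge and the new behavior can be sketched on plain file-name lists. The helper name is hypothetical; the error message is the MATLAB one quoted above.

```python
def merge_file_lists(first: list, second: list) -> list:
    """Concatenate two file-name lists in order, rejecting any overlap."""
    if set(first) & set(second):
        raise ValueError(
            "Documents have files of the same name. Cannot be combined."
        )
    # Plain concatenation keeps the original order, unlike list(set(...)).
    return list(first) + list(second)
```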
§3.4-3: copied the four missing ndi_common definitions + their schemas from the
MATLAB tree (apps/kilosort/kilosort_clusters, data/filter, data/pyraview,
treatment/treatment_transfer) and restored the depends_on: [{name: document_id}]
block on data/ontologyTableRow{,_schema} that the Python copy had dropped
(MATLAB 5b5b56d5).
§3.6: read_blank_definition is now memoized — the static JSON definitions and
their superclass chains are parsed once and served as deep copies, avoiding a
per-construction disk read + parse of the definition and every superclass.
Callers still receive an independent copy they may mutate.
Tests: new tests/test_pr6_core_docs.py (14) — dependency numbered/fallback/
priority/empty-skip/missing, dup-file raise vs in-order merge, the four new
definitions load, ontologyTableRow document_id dependency, and the cache
(populated + returns independent copies). black + ruff clean; fast suite at the
9 baseline failures.
… registry (§3.4-14)
Uberon and NCIT were listed in ontology_list.json (prefix map + Ontologies) but
had no provider class, so lookup("UBERON:heart") — MATLAB's headline example —
dispatched to a missing provider and silently returned an empty result. Add the
six missing OLS-backed providers (Uberon, NCIT, EDAM, IAO, STATO, SchemaOrg),
register them in PROVIDER_REGISTRY, and extend ontology_list.json with the four
ontologies not yet present (EDAM, IAO, STATO, SchemaOrg) — prefix mappings +
Ontologies metadata pointing at the EBI OLS4 API.
All six route through the existing OLSProvider (label and obo_id search). EDAM
and schema.org use named/sub-typed identifiers rather than a flat numeric series,
so id lookups should pass the full term; label lookups work as usual (noted in
the class docstrings).
Tests: new tests/test_pr7_ontology_providers.py (9) — each provider registered as
an OLSProvider with an ontology slug, the new ontology_list entries present, and
a mocked-OLS lookup("UBERON:0000948") proving the full dispatch->provider->parse
chain now resolves (the fix), plus an unmapped prefix still returning empty.
black + ruff clean; fast suite at the 9 baseline failures.
NOTE: ndi_common/ontology/ontology_list.json full parity with ndi-ontology-matlab
(the 22-entry source) needs that repo, which is not cloned; this adds the headline
ontologies so lookups resolve. Remaining list sync tracked for the ontology
single-sourcing PR (cross-repo §6.2-6).
…§3.6)
Close the High-severity security/packaging findings plus the
security-flavored Medium/Low items.
§3.5 security:
- Secrets at rest (profile.py): write NDI_Cloud_Secrets.json /
NDI_Cloud_Profiles.json with mode 0600 (created private from the first
byte via os.open, no world-readable window), ~/.ndi as 0700; keyring is
already preferred and the AES fallback is documented as obfuscation.
- eval() on document-derived strings: file/navigator fileparameters now
parse with ast.literal_eval, not eval. fun/data.py evaluate_fitcurve's
fit_equation (the audit assumed it was sandboxed) is evaluated by a
restricted arithmetic AST walker -- the {"__builtins__": {}} guard is not
a real sandbox (the ().__class__...__subclasses__ trick reaches
os.system), so this closes a confirmed RCE of the same class.
- Unpinned supply chain: the four git dependencies are pinned to commit
SHAs (no release tags exist upstream yet -- tagging is a follow-up) and
ndi_install.py clones then checks out a pinned ref instead of a floating
branch.
- CI: cloud credentials removed from per-PR ci.yml; the destructive MATLAB
BYOL tests now run on the scheduled (non-PR) workflow; all third-party
actions (incl. ehennestad/matbox-actions) are SHA-pinned; pip caching
added; pytest-xdist + `-n auto` wired in.
- Packaging: removed the 7 MB pythonArtifacts.tar.gz (regenerated by CI) and
gitignored *.tar.gz; dropped the dangling MATLAB_MAPPING.md from
MANIFEST.in/sdist; removed the contradictory Other/Proprietary license
classifier (project is CC-BY-NC-SA-4.0).
- XML: NCBI/Crossref response parsing in ontology/providers.py uses
defusedxml (added as a dependency). crossref.py only generates XML, so it
is intentionally untouched here.
- Download path traversal: downloadGenericFiles and downloadFilesForDocument
basename remote-derived filenames and verify containment before writing.
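A restricted arithmetic walker of the kind described in the eval() item above can be sketched with the ast module. The whitelist of operators, the function name, and the variables argument are assumptions; this is not the evaluate_fitcurve implementation.

```python
import ast
import operator

_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval(expression: str, variables: dict):
    """Evaluate arithmetic over numbers and named variables only."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant):
            value = node.value
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                return value
        elif isinstance(node, ast.Name):
            if node.id in variables:
                return variables[node.id]
        elif isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](walk(node.operand))
        # Calls, attribute access, subscripts, etc. all end up here.
        raise ValueError(f"disallowed expression element: {type(node).__name__}")

    return walk(ast.parse(expression, mode="eval"))
```

Because attribute access and calls are never evaluated, an expression such as `().__class__.__bases__[0].__subclasses__()` is rejected at the first disallowed node instead of being executed.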
§3.6 (security-adjacent):
- Hard-fail logging: silent except/pass in client.py and database_fun.py now
logger.debug the exception.
- Login errors no longer embed the full server response body (bounded).
- Test ergonomics: the MATLAB BYOL guard skips (module-level) instead of
erroring collection when the env var is unset, so a bare `pytest tests/`
no longer errors on unconfigured machines while still refusing the
destructive path; @requires_network ontology lookups skip on empty remote
results.
New tests/test_pr9_security.py (26 tests) pins each behavior, including the
fitcurve sandbox-escape rejection and both path-traversal sites. black/ruff
clean; fast suite holds the 9-failure environmental baseline (zero
regressions). 5-lens adversarial verification surfaced the fitcurve RCE,
the downloadFilesForDocument twin traversal, the first-write chmod window,
and the BYOL CI-coverage gap -- all fixed here.
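The basename-plus-containment check used at the two path-traversal sites can be sketched as follows. The function name and the backslash normalization are assumptions; Path.is_relative_to requires Python 3.9 or later.

```python
import os
from pathlib import Path


def safe_destination(download_dir: Path, remote_name: str) -> Path:
    """Map a remote-derived file name to a path inside download_dir."""
    # Treat backslashes as separators too, so "..\\x" is reduced as well.
    name = os.path.basename(remote_name.replace("\\", "/"))
    if name in ("", ".", ".."):
        raise ValueError(f"unusable remote file name: {remote_name!r}")
    root = download_dir.resolve()
    target = (root / name).resolve()
    # resolve() follows symlinks, so a link inside the directory that
    # points elsewhere is caught by the containment check.
    if not target.is_relative_to(root):
        raise ValueError(f"path escapes download directory: {remote_name!r}")
    return target
```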
Deferred (logged in handoff): plaintext NDI_CLOUD_TOKEN/PASSWORD env export
(parity with MATLAB, fix in lockstep -- §4); crossref.py corrupted import
(owned by PR1); bridge YAML (PR10).
…3.6)
Bridge-contract reconciliation (the ndi_matlab_python_bridge.yaml contracts
batched here from PR2-PR8, which each deferred their bridge updates to this
PR). Each namespace was reconciled against the actual ported Python code and
the MATLAB source of truth (NDI-matlab @ 2d76370):
- cloud / cloud.sync / cloud.api: first-arg renames (uploadFilesForDataset-
Documents org_id->dataset; sync ops dataset_path->dataset object;
deleteLocalDocuments ds_path->dataset); camelCase index write; client 5xx
retry; route fixes (abortSession DELETE /compute/{id}; finalizeSession POST
/compute/{id}/advance; files.getBulkUploadURL GET); new formatApiError.
- epoch: epochnodes/underlyingepochnodes (PR4); buildepochgraph +
epochgraph->(cost,mapping), matchedepochtable->bool hash, epochprobemap_
daqsystem header-row + *_array methods (PR5).
- time / time.syncrule: addunderlyingepochs + time_convert retry/equivalence
edges (PR4); new ndi.time.fun sync_trigger_trains/sync_random_triggers +
syncrule fingerprint fallbacks (PR5).
- daq / daq.metadatareader: mfdaq analog-event types aep/aen/aimp/aimn +
strip_threshold_suffix/threshold_suffix; VHAudreyBPod read_audrey_bpod_json.
- element / session / dataset: neuron + element.timeseries registry classes,
isIngested (+ alias), dataset.convertLinkedSessionToIngested, addMultiple.
- probe: stimulator.pairOnOff.
- ontology: six new OLS providers documented (UBERON/NCIT/EDAM/IAO/STATO/
SchemaOrg), NCIT(Thesaurus) vs NCIm(Metathesaurus) distinction noted.
- core (ndi root): document dependency_value_n fallback, __add__ dup-file
raise, read_blank_definition memoization decision-logs + sync refresh.
All 14 bridge YAMLs parse cleanly; entries mirror each file's existing style;
matlab_last_sync_hash values verified against the per-file MATLAB commit at
2d76370 (one 9-char hash normalized to the 8-char convention). Produced via
parallel per-namespace grounding + an aggregate consistency audit.
Docs hygiene (§3.6):
- PYTHON_PORTING_GUIDE.md: Black/Ruff line length corrected 88 -> 100 (matches
pyproject + AGENTS.md); §4 rewritten so @pydantic.validate_call applies at
trust boundaries only, not blanket across the core (the locked decision).
- version() git-describe-vs-package-version behavior is already documented in
its docstring (§3.6 item satisfied; noted).
- REPO_AUDIT.md marked SUPERSEDED by ECOSYSTEM_AUDIT_2026-06.md.
MERGE ORDER: this PR's bridge entries describe code introduced by PR2-PR8, so
it must merge AFTER those branches for the contract to match the tree (the
entries are forward-looking on an origin/main base, by the batching design).
Zero Python touched; fast suite holds the 9-failure environmental baseline.
Deferred / flagged for reviewer: cloud files.getBulkUploadURL has a phantom
matlab_path (pre-existing; no +files/getBulkUploadURL.m upstream); the four
new ndi_common JSON doc definitions + ontology_list.json prefix entries are
data (ndi_common namespace), tracked with their PRs, not in these bridges.
…§3.4-13, part 1)
Port the compute methods of three app classes from the MATLAB source of
truth, replacing NotImplementedError stubs. Based on the PR8 stack (neuron
registry + addMultiple) + PR4 time/vhsb.
ndi.app.spikeextractor:
- makefilterstruct + filter via scipy.signal (cheby1/butter per the MATLAB
filter struct); threshold detection + waveform extraction + centering.
- _dotdisc and _refractory are local ports of the vlt MEX/m helpers
(vlt.signal.dotdisc / refractory are absent from the Python vlt port).
Both were VERIFIED line-for-line against the authoritative
vhlab-toolbox-matlab sources (dotdisc.c, refractory.m): dotdisc emits one
event at ceil(i-ptsgood/2) per supra-threshold run; refractory is the
round-based pairwise-diff collapse (NOT single last-kept).
- Waveforms persist via ndi.util.vhsb (the .vsw vhlspikewaveformfile custom
binary format is NOT reimplemented — documented divergence).
ndi.app.oridirtuning:
- tuning-curve computation + vector orientation/direction indices
(oridir_vectorindexes-equivalent; vlt's oridir index module is absent from
the Python port). The _compute_circularvariance / _orientationindex /
_directionindex helpers were VERIFIED against the MATLAB compute_*.m
(exact 1e-4 epsilon + round(100*x)/100), and the doc types corrected to the
real ndi_common paths (stimulus/vision/oridir/...).
ndi.app.spikesorter:
- non-interactive scaffolding (loadwaveforms, clusterinfo init via the
available vlt.neuro.spikesorting.cluster_initializeclusterinfo /
oversamplespikes / spikewaves2pca); validation/struct2doc made vlt-free.
- The interactive sorting path is a BLOCKER, left raising NotImplementedError:
MATLAB drives it through vlt.neuro.spikesorting.cluster_spikewaves_gui (an
interactive GUI, not portable to a headless library) and vlt.spikewaves
(absent from the Python vlt port). No automatic clustering was faked in.
Tests: new tests/test_pr11_{oridirtuning,spikeextractor,spikesorter}.py (42
tests, guarded by pytest.importorskip("vlt") so the standard no-vlt suite
skips them cleanly). The pre-existing tests/test_batch_c.py app-contract
tests were reconciled to the MATLAB-correct contracts they had been
encoding incorrectly (flat extraction-parameter fields per spikeextractor.m,
six sorting-parameter fields per spikesorter.m, real oridir doc-type paths,
is_oridir_stimulus_response single-arg signature). All app modules still
import WITHOUT vlt (deferred imports), so the standard CI env is unaffected.
black + ruff clean; test_batch_c.py (82) + matlab_tests/test_app.py (38) +
the three pr11 suites all green, single-process.
DEFERRED to PR11 part 2 (flagged, not faked):
- ndi.app.markgarbage.identifyvalidintervals: a faithful port requires
migrating valid_interval storage to the schema's array-of-structs model
with reconstructable timeref_structt0/t1 (the ndi_common valid_interval
schema is an array; the current Python stores a single struct with string
timerefs). That storage-contract change cascades through the existing
mocked app tests and warrants its own focused PR. markgarbage is left at
its base state here.
- ndi.app.stimulus.tuning_response (1059 MATLAB lines, stimulus-response
infrastructure) — left as a stub.
- spikeextractor .vsw custom binary format; spikesorter GUI clustering;
oridir double-gaussian fit indices — all blocked on absent vlt deps / a GUI.
…ion 5)
Port ndi.fun.probe.import.kilosort.* + extracellularInfo + plotProbeGeometry
from MATLAB. Imports curated Kilosort/Phy output into NDI: per curated cluster
passing the quality filter, creates an ndi.neuron element (spike times mapped
from the concatenated Kilosort sample stream into each epoch's local time) plus
a neuron_extracellular document (mean waveform, sample counts, cluster index,
quality), and a kilosort_clusters document storing the MD5 of spike_clusters.npy
for change detection. Builds on the PR8 stack (neuron registry + addMultiple)
and PR4 time/vhsb.
Modules (src/ndi/fun/probe/):
- import_/kilosort/{session,probe,getInfo,labels,waveformdata,meanwaveform,
removeold}.py + import_/kilosort/ndi_matlab_python_bridge.yaml. NOTE: the
subpackage is import_ because `import` is reserved in Python -- the one
unavoidable naming divergence; callers use ndi.fun.probe.import_.kilosort.*
(documented in every docstring + the bridge YAML).
- extracellularInfo.py, plotProbeGeometry.py (matplotlib deferred-imported).
- readNPY -> numpy.load; .tsv via stdlib csv (no hard pandas dep).
Correctness (verified):
- 0-based (numpy/Kilosort on disk) vs 1-based (MATLAB) handled explicitly: spike
ids/clusters/templates are 0-based on disk in both; the global->epoch-local
sample mapping drops MATLAB's +1 because the real
ndi.probe.timeseries.samples2times is 0-based (confirmed: "Convert 0-based
sample indices to times"). bounds0 = concat([0], cumsum(epoch_counts));
local0 = g0 - bounds0[e].
- cluster grouping = flatnonzero(spike_clusters==cid); case-insensitive
curation-label match with parallel quality_values; sampleOutOfRange guard;
amplitude-weighted template-average waveform with optional whitening_mat_inv
un-whitening. neurons + docs committed via element.timeseries.addMultiple
(the bulk path MATLAB uses).
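A minimal sketch of the 0-based global->epoch-local mapping described above, assuming the per-epoch sample counts are already known. The function name, the returned in_range flag (a stand-in for the sampleOutOfRange guard), and the example numbers are illustrative, not the importer's actual code:

```python
import numpy as np


def global_to_epoch_local(global_samples, epoch_counts):
    """Map 0-based samples of a concatenated stream to (epoch, local sample)."""
    g0 = np.asarray(global_samples, dtype=np.int64)
    # bounds0[e] is the global index of the first sample of epoch e.
    bounds0 = np.concatenate(([0], np.cumsum(epoch_counts)))
    # side="right" assigns a sample sitting on a boundary to the epoch starting there.
    epoch = np.searchsorted(bounds0, g0, side="right") - 1
    # Hypothetical guard: samples outside the concatenated stream are flagged.
    in_range = (g0 >= 0) & (g0 < bounds0[-1])
    safe_epoch = np.clip(epoch, 0, len(epoch_counts) - 1)
    local0 = g0 - bounds0[safe_epoch]  # no +1: everything stays 0-based
    return epoch, local0, in_range


epoch, local0, in_range = global_to_epoch_local([0, 99, 100, 249, 250], [100, 150])
```

With two epochs of 100 and 150 samples, global sample 100 is local sample 0 of the second epoch, and global sample 250 is flagged as out of range.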
ndi_common: ported the kilosort_clusters database + schema docs that probe.py
requires (apps/kilosort/). These are byte-identical to the copies PR6
(fix/core-docs-ndi-common) also adds -- at merge they are the same file (no
conflict); keep one.
Deliberate, documented deviations (NOT faked science):
- progressbar option omitted: MATLAB opens ndi.gui.component.ProgressBarWindow;
NDI-python has no GUI subsystem, so the GUI option is dropped (addMultiple's
verbose logging stands in). Flagged in the bridge YAML omitted_options.
- removeold/extracellularInfo split OR'd depends_on queries into separate
searches + union: ndi_query does not correctly compose
depends_on(a)|depends_on(b) (returns no matches). Same results, but the
load-time optimization is dropped; flagged in code as a pre-existing engine
limitation to follow up.
Tests: tests/test_pr12_kilosort.py (39, single-process) builds a synthetic tiny
Kilosort fixture (hand-built spike_times/clusters/templates/amplitudes +
cluster_group.tsv) with hand-computed expectations, and runs probe()/session()/
removeold()/extracellularInfo() against a real ndi_session_dir with a
deterministic fake probe. black + ruff clean; all modules import in the standard
env (deferred matplotlib/pandas/vlt).
CAVEAT (honest): with no real Kilosort data available, the parsing, cluster
grouping, curation, 0-based sample mapping, out-of-range, and waveform math are
validated deterministically against hand-built arrays, but exact numeric
agreement with a genuine Kilosort/Phy export -- and the assumption that
spike_times.npy aligns with probe.epochtable() concatenation order -- is not
validated end-to-end. Re-verify against a real export before production trust
(same class of caveat as the syncgraph symmetry suite).
…struct storage (§3.4-13 pt2)
Complete ndi.app.markgarbage to the MATLAB contract (PR11 part 2, item 1).
- markvalidinterval now serializes each timeref into a reconstructable struct
(timeref_structt0/timeref_structt1) via the timereference to_dict(), matching
the ndi_common valid_interval schema; savevalidinterval stores the schema's
ARRAY of interval structs (MATLAB markgarbage.m: load existing array -> skip
exact duplicate -> append -> clear old doc -> store the whole array).
- identifyvalidintervals (was NotImplementedError) ported: each stored region
is rebuilt via timereference.from_struct and projected into the query
timeref's referent+clock through the session syncgraph (PR4 time_convert);
regions that do not reconstruct/project, or that land in a different epoch,
impose no restriction (MATLAB's empty-projection branch); the surviving
regions are unioned by an inline _interval_add (the net result of
vlt.math.interval_add, implemented inline so this core app keeps NO vlt
dependency).
- loadvalidinterval reads the array and keeps the MATLAB underlying-element
fallback, guarded by isinstance(ndi_element) (mirrors MATLAB isprop) so a
duck-typed mock cannot recurse forever.
Tests: new tests/test_pr11_markgarbage.py (12, no vlt needed) covers struct
serialization, the interval-union, savevalidinterval array accumulation +
dedup, and identifyvalidintervals (empty/non-reconstructable/projecting/
wrong-epoch branches). The pre-existing mocked tests in
matlab_tests/test_app.py and test_batch_c.py were updated to the
schema-correct contract (valid_interval is an array; timeref is stored as
timeref_structt0). black + ruff clean; test_app.py (38) + test_batch_c (82) +
the markgarbage suite green single-process.
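For orientation, the interval-union step can be sketched as a plain sort-and-merge. This is not the ported _interval_add; the function name is hypothetical, and treating touching intervals as one merged interval is an assumption of this sketch:

```python
def union_intervals(intervals):
    """Union of closed [start, stop] intervals, returned sorted and disjoint."""
    merged = []
    for start, stop in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], stop)
        else:
            merged.append([start, stop])
    return [tuple(pair) for pair in merged]


regions = union_intervals([(3, 4), (0, 1), (0.5, 2)])
```

Here the first two regions overlap and collapse into one, while the third stays separate.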
…TLAB (§3.4-13 pt2)
Complete ndi.app.stimulus.tuning_response (PR11 part 2, item 2): the F0/F1
stimulus-response pipeline and tuning-curve assembly, ported from
tuning_response.m. Reuses the existing stimulus infra (decoder, fun.stimulus),
PR4 syncgraph.time_convert + readtimeseries, and Part 1's markgarbage
identifyvalidintervals.
Implemented:
- control_stimulus (m:546) + label_control_stimuli (m:506): regular/pseudorandom
control-stimulus identification (ported _stimids2reps/_findcontrolstimulus,
matched to the vlt docstring example) and control_stimulus_ids documents.
- compute_stimulus_response_scalar (m:140): per-stimulus on/off windows -> F0
(mean) and, when a temporal frequency is present, F1/F2 responses, with control
subtraction and prestimulus normalization, assembling stimulus_response_scalar
documents. The F0/F1 math (_fouriercoeffs_tf2 / _stimulus_response_scalar) is a
faithful inline port of vlt.math.fouriercoeffs_tf2 +
vlt.neuro.stimulus.stimulus_response_scalar (kept inline so the app needs no
vlt). VERIFIED against the authoritative fouriercoeffs_tf2.m: tf==0 ->
mean(response); else (2/N)*exp(-(1:N)*2*pi*i*tf/SR)*response with a 1-BASED
index, numerically exact on a pure cosine (|F1|==amplitude, F0==DC).
- stimulus_responses (m:26) orchestration; tuning_curve (m:334) + make_1d_tuning
(m:730) aggregation into stimulus_tuningcurve docs (mean/stddev/stderr per
unique varied-parameter value, complex magnitude when modulated, control stats).
- decoder.load_presentation_time: ported the MATLAB deprecated/inline branch
(read presentation_time directly off the document), which UNBLOCKS the whole
pipeline for inline-form stimulus_presentation documents.
Tests: new tests/test_pr11_tuning_response.py (43, no vlt needed): F0/F1 incl.
the 1-based index convention, stimids2reps/findcontrolstimulus vs the MATLAB
docstring example, the full _stimulus_response_scalar pipeline on a synthetic
DC+sinusoid (F0~DC, |F1|~amplitude, control subtraction), control_stimulus/
label_control_stimuli/tuning_curve against mocked sessions producing real
ndi_documents, and load_presentation_time inline-vs-binary. test_batch_c.py
updated: the two now-implemented *_raises tests became no-session-returns-[] /
requires-session/requires-parameter. Module imports without vlt; black + ruff
clean; full app suite (198) green single-process.
REMAINING BLOCKER (flagged, not faked): documents whose timing lives ONLY in
the binary presentation_time.bin portion need
ndi.database.fun.read_presentation_time_structure + database_openbinarydoc,
which are not yet ported; compute_stimulus_response_scalar raises
NotImplementedError naming them for binary-only docs.
CAVEAT: the F0/F1/control math is validated on synthetic deterministic data +
line-by-line vs MATLAB, but end-to-end agreement with a real stimulus recording
is unverified without real data.
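The fouriercoeffs_tf2 formula quoted above can be written out directly in numpy. The sketch below follows only the formula as stated in this message (tf==0 -> mean; otherwise a 1-based index); the function name and the test signal are illustrative and this is not the ported _fouriercoeffs_tf2:

```python
import numpy as np


def fouriercoeffs_tf2_sketch(response, tf, sample_rate):
    """Fourier coefficient of `response` at temporal frequency `tf` (Hz)."""
    response = np.asarray(response, dtype=float)
    if tf == 0:
        return complex(response.mean())
    n = response.size
    k = np.arange(1, n + 1)  # 1-based index, mirroring MATLAB's (1:N)
    kernel = np.exp(-1j * 2 * np.pi * tf * k / sample_rate)
    return (2.0 / n) * np.sum(kernel * response)


# Synthetic DC + cosine spanning a whole number of cycles (5 cycles in 1 s).
sample_rate, tf, dc, amplitude = 1000.0, 5.0, 3.0, 2.0
t = np.arange(1000) / sample_rate
signal = dc + amplitude * np.cos(2 * np.pi * tf * t)
f0 = fouriercoeffs_tf2_sketch(signal, 0, sample_rate)
f1 = fouriercoeffs_tf2_sketch(signal, tf, sample_rate)
```

Because the window covers an integer number of cycles, abs(f1) recovers the cosine amplitude and f0 recovers the DC offset; the 1-based index only rotates the phase of f1.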
…-> 70s
Loading a dataset from JSON (ndi.dataset.dir(reference, path, documents))
inserted documents with a per-document database.add() loop. Each add() calls
the DID driver's get_doc_ids(), a full scan of every existing id, purely to
reject duplicates, so inserting N documents re-scans the growing set N times:
O(N^2). cProfile on a real dataset confirmed it: 42k SQLite fetchall calls for
3k inserts, ~14 scans per add, dominated by did get_doc_ids. On the 78,687-doc
jess-haley dataset this made the load take 40+ minutes (never completed in a
test run); per-doc cost grew 0.8 -> 1.4 ms as N rose.
Fix: add SQLiteDriver.bulk_add_documents (fetch the existing-id set ONCE, build
all DID docs, insert them in a single add_docs call) and expose it as
ndi_database.add_documents; route the dataset's documents= bulk-load through it.
Per-document problems (missing base.id, duplicate, malformed) are collected and
returned as (doc_id, reason) instead of raised, preserving the resilience of
the old loop (add_doc_failures is populated the same way).
Result on real data: load is now O(N), a constant ~0.63 ms/doc across
1k/2k/4k/8k, and the full 78,687-doc jess-haley dataset loads in ~70s with zero
failures and the same document set. black + ruff clean; full fast suite holds
the 9-failure environmental baseline (zero regressions); carbon-fiber real-data
load is byte-for-byte unchanged.
Found while running real-dataset tests. The per-doc add loops in
_copy_session_documents / add_ingested_session have the same O(N^2) shape but
interleave per-doc binary copying; left for a focused follow-up.
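The shape of the fix (fetch the existing ids once, collect per-document failures, insert in one batch) can be illustrated with the stdlib sqlite3 module. The table layout, function name, and reason strings below are hypothetical; this is not the DID schema or the actual bulk_add_documents:

```python
import sqlite3


def bulk_add(conn, docs):
    """Insert (doc_id, body) pairs in one batch; return [(doc_id, reason), ...]."""
    # One scan of the existing ids, instead of one scan per inserted document.
    existing = {row[0] for row in conn.execute("SELECT id FROM docs")}
    failures, rows = [], []
    for doc_id, body in docs:
        if not doc_id:
            failures.append((doc_id, "missing id"))
        elif doc_id in existing:
            failures.append((doc_id, "duplicate"))
        else:
            existing.add(doc_id)  # also rejects duplicates within the batch
            rows.append((doc_id, body))
    conn.executemany("INSERT INTO docs (id, body) VALUES (?, ?)", rows)
    conn.commit()
    return failures


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id TEXT PRIMARY KEY, body TEXT)")
failures = bulk_add(conn, [("a", "{}"), ("b", "{}"), ("a", "{}"), (None, "{}")])
```

Problems are returned rather than raised, so one bad document does not abort the rest of the batch.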
…aseline)
Two related fixes so Python can open datasets created by NDI-matlab (and so the
long-standing "9 environmental failures" disappear).
1. find() no longer calls get_docs_by_branch. The released DID-python has no
such method, so every "fetch all documents" path (ndi_database.find() with
no query) raised AttributeError — this was the entire test_database.py /
test_session.py "9-failure baseline". find() now uses get_doc_ids +
get_docs (both present). Result: the fast suite goes from 9 failed to
0 failed (1572 passed).
2. Open MATLAB-written datasets. A dataset saved by NDI-matlab could not be
opened in Python (silently read as empty) because of four layout/format
differences; each is now handled:
- DB filename: Python writes "did-sqlite.sqlite", MATLAB writes "ndi.db"
(same DID schema). ndi_database._resolve_db_file picks whichever file in
the .ndi dir holds the most documents.
- Branch: Python uses branch "a", MATLAB stores docs on "main". SQLiteDriver
now adopts a populated branch when the requested one is empty.
- Directory layout: MATLAB keeps the dataset under <path>/.ndi_dataset/.ndi,
Python under <path>/.ndi. ndi_dataset_dir._dataset_session_path() prefers
the .ndi_dataset location when present (in both the initial open and the
_discover_correct_session re-open).
- Document format: MATLAB stores a single-element depends_on/superclasses as
a bare dict, which crashes DID's field_search (it iterates expecting a
list of dicts). find() now catches that and falls back to a normalized
brute-force pass (each doc routed through ndi_document, which normalizes
the bare dict to a list, before applying field_search).
Verified on real MATLAB datasets: ndi_dataset(dabrowska) now opens ndi.db on
branch main and searches correctly — isa(base)=14649, ontologyTableRow=6205,
element=606 (exact expected counts); the dabrowska test goes from 3 passed to
32 passed (the remaining failures are unported table-analysis helpers,
unrelated). Normal Python datasets are unaffected (the fallbacks only trigger
for foreign/empty databases): carbon-fiber load byte-identical, full fast suite
1572 passed / 0 failed, black + ruff clean.
Follow-up: the brute-force search fallback re-reads + re-normalizes every
document per query (fine for correctness, slow on large MATLAB datasets). A
one-time import that rewrites ndi.db into a normalized did-sqlite.sqlite would
make subsequent queries use DID's native fast path.
…perf
Checkpoint on the open-matlab-datasets branch (to be folded into the integrated
branch, where the EMPTY-URL and bulk_add_documents pieces come from PR1 and the
perf branch instead of being duplicated here).
Analysis-layer parity — the failures were test-call bugs against already-correct
ports, NOT missing functions (verified: ontologyTableRowDoc2Table returns a
correct (45,51) table; identifyMatchingRows matches MATLAB):
- test_dabrowska / test_jess_haley: stringMatch (not string_match), StackAll
(not stack_all), and *_-unpack the faithful 4-tuple from ontologyTableRowDoc2Table.
- ontology EMPTYProvider URL had the ndi_gui_ corruption
(Waltham-ndi_gui_Data-Science); fixed so lookup("EMPTY:0000074") resolves
(PR1 also fixes this).
Result: dabrowska 3 -> 47 passed (all MATLAB value-assertions pass).
Perf — import-once: opening a MATLAB ndi.db imports it once into a normalized
Python did-sqlite.sqlite (cached) so queries use DID's fast native search
instead of the per-query brute-force fallback. The normalization is a
lightweight bare-dict->list rewrite (_normalize_doc_props), NOT a full
ndi_document construction (which would re-read each blank definition from disk
— the §3.6 hot path PR6 memoizes; full construction made a 78k-doc import hang
for >8 min). bulk_add_documents batches the insert (single add_docs).
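A rough sketch of what a bare-dict->list rewrite can look like on a plain property dict. The field locations (top-level depends_on, document_class.superclasses) and the function names are assumptions for illustration; the real _normalize_doc_props may differ:

```python
import copy


def as_list_of_dicts(value):
    """Wrap a single bare dict in a list; leave lists as lists."""
    if value is None:
        return []
    if isinstance(value, dict):
        return [value]
    return list(value)


def normalize_doc_props_sketch(props):
    """Return a copy of `props` whose list-valued fields really are lists."""
    out = copy.deepcopy(props)
    if "depends_on" in out:
        out["depends_on"] = as_list_of_dicts(out["depends_on"])
    doc_class = out.get("document_class")
    if isinstance(doc_class, dict) and "superclasses" in doc_class:
        doc_class["superclasses"] = as_list_of_dicts(doc_class["superclasses"])
    return out


matlab_style = {
    "depends_on": {"name": "element_id", "value": "abc"},
    "document_class": {"superclasses": {"definition": "base.json"}},
}
normalized = normalize_doc_props_sketch(matlab_style)
```

Working on the raw dict avoids constructing a full document object per record, which is the cost the import-once path is trying to avoid.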
# Conflicts: # src/ndi/app/spikeextractor.py
# Conflicts: # src/ndi/epoch/epochset.py
…is MATLAB-compatible
The spike pipeline did not round-trip and was not cross-language readable:
- spikeextractor wrote spikewaves.vsw as VHSB bytes (ndi.util.vhsb), but
spikesorter read it with vlt.file.custom_file_formats.readvhlspikewaveformfile
(the real .vsw format) -- which is also absent from the Python vlt port, so the
read path ImportError'd. The two sides never agreed.
- spiketimes.bin was written float32 ("<f4") but read float64 ("<f8") -> garbage.
Add ndi.util.vhlspikewaveformfile, a faithful port of vlt's
new/add/readvhlspikewaveformfile (big-endian 512-byte header: numchannels/S0/S1/
name/ref/comment/samplingrate; float32 data laid out per-waveform as the
column-major flatten of (samples, channels)). spikeextractor now writes the real
.vsw via it; spikesorter reads via it (no vlt dependency). Fix the spiketimes
reader to "<f4" to match the writer and MATLAB fwrite(...,'float32').
Result: a MATLAB-extracted spikewaves.vsw reads here and vice versa, and the
Python extract->sort round-trip is consistent. New tests cover round-trip, the
big-endian MATLAB header/data layout, a hand-built MATLAB file, subset/header-only
reads, and empty waveforms. Full app + fast suites green (1905/0).
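The spiketimes.bin mismatch is easy to reproduce with numpy alone: bytes written as "<f4" and read back as "<f8" give half as many values, none of them meaningful. The sample values are arbitrary:

```python
import numpy as np

spike_times = np.array([0.5, 1.25, 2.0, 3.75], dtype="<f4")
raw = spike_times.tobytes()  # 4 values * 4 bytes, as a float32 writer would emit

good = np.frombuffer(raw, dtype="<f4")  # reader matches the writer
bad = np.frombuffer(raw, dtype="<f8")  # reader assumes float64: pairs of floats fused
```

Here good round-trips exactly, while bad contains two unrelated numbers instead of four spike times.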
…clusters2neurons)
Implements the non-graphical spike-sorting path that was previously a blocker:
- ndi.util.klustakwik: wrapper around the optional klustakwik2 package (masked
KlustaKwik port). Builds dense-feature RawSparseData, clusters via cluster_from
with num_start restarts (best score kept), returns 1-based contiguous cluster
ids. Includes a narrow numpy>=2 compat shim for klustakwik2 0.2.6
(ndarray.tostring -> tobytes). HONEST PARITY: same-family, not bit-identical to
MATLAB's external classic-KlustaKwik binary; clustering is stochastic.
- spikesorter.cluster_initializeclusterinfo: numpy port of the MATLAB
InitClusterInfo mean-waveform computation (the vlt python stub took no args).
- spike_sort: load waveforms -> prepare PCA features -> cluster -> store a
spike_clusters document (clusterinfo + spike_cluster.bin as uint16). Idempotent
+ redo. graphical_mode=1 raises directing callers to the separate PyQt editor.
- clusters2neurons: keep usable (curated) clusters, create ndi.neuron elements +
neuron_extracellular docs + per-epoch spike trains via addMultiple
(build_objects=False, so spike sorting does not pull in the DAQ reader stack).
Fixes two latent bugs in the sorter path now that it is exercised:
- _find_extraction_parameters_doc queried isa('extraction_parameters'); the
extractor stores 'spike_extraction_parameters', so it never matched.
- loadwaveforms / clusters2neurons did not unpack epochtable()'s (table, hash).
Adds the [sorting] optional extra (klustakwik2) and tests/test_spikesorter_clustering.py
(cluster_initializeclusterinfo math, the klustakwik2 wrapper, and end-to-end
spike_sort/clusters2neurons against a real session). Fast suite 1915 passed / 0
failed with vlt+klustakwik2 present; vlt/klustakwik2-gated tests skip cleanly when absent.
…o the canonical engine
Implements the four previously-deferred cloud sync items (PR2 follow-ups), all
live-validated against the real cloud with create->upload->validate->download->delete:
- validate() content comparison: validateSync now downloads the common remote
documents and deep-compares their contents (MATLAB validate.m's mismatch
detection), reporting mismatched_ids + mismatch_details {ndiId, apiId, reason}.
Comparison drops the 'files' field from both sides and the cloud-added id from
the remote (isequaln-style deep equality with NaN==NaN, int/float coercion).
- download-side file sync: downloadNew / mirrorFromRemote / twoWaySync now
materialise downloaded documents' binaries when sync_files=True, reusing the
dataset's on-demand ndic:// fetch (database_openbinarydoc). Upload-side
batch-zip was already implemented.
- syncDataset -> operations.sync routing: orchestration.syncDataset now delegates
to the canonical engine (SyncIndex tracking, issue-805 guard, all five modes
incl. the two mirror modes that were previously a no-op note) and enumerates
local docs from dataset.database_search instead of the broken dataset.session
path. The legacy return shape (sync_mode/cloud_dataset_id/downloaded/uploaded/
deleted) is preserved; the full engine report is attached additively. Removes
the dead _sync_download_new/_sync_upload_new helpers.
- downloadNdiDocuments failed-detection fix (found via live validation): the bulk
download returns bodies keyed by NDI id, not the cloud api id, so matching by
api id flagged every successfully-downloaded doc as failed. Now matches by NDI id.
Adds tests/test_cloud_sync_deferred.py (13 tests; all cloud seams mocked). Fast
suite 1928 passed / 0 failed. Live round-trip confirmed: validate clean = 0
mismatches, validate after a local edit correctly flags the one changed doc,
download failed-list now empty; temp datasets hard-deleted on cleanup.
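The content comparison can be pictured as a recursive equality that treats NaN as equal to NaN and compares ints and floats by value, applied after dropping the fields that legitimately differ. The helper names and the remote-only key name ("id") are assumptions, not the actual validateSync code:

```python
def deep_equal(a, b):
    """Deep equality where NaN equals NaN and 1 equals 1.0 (a sketch)."""
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(deep_equal(a[k], b[k]) for k in a)
    if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
        return len(a) == len(b) and all(deep_equal(x, y) for x, y in zip(a, b))
    if isinstance(a, float) and isinstance(b, float) and a != a and b != b:
        return True  # both are NaN
    # Python already compares int and float by value (1 == 1.0).
    return a == b


def docs_match(local, remote, remote_only_keys=("id",)):
    """Compare two document dicts, ignoring 'files' and remote-only keys."""
    local_view = {k: v for k, v in local.items() if k != "files"}
    remote_view = {
        k: v for k, v in remote.items() if k != "files" and k not in remote_only_keys
    }
    return deep_equal(local_view, remote_view)


local = {"base": {"id": "abc", "n": 1, "x": float("nan")}, "files": {"a": 1}}
remote = {"id": "cloud-1", "base": {"id": "abc", "n": 1.0, "x": float("nan")}, "files": {"b": 2}}
same = docs_match(local, remote)
```

A real comparison would also need to report which field differed to fill in mismatch_details; this sketch only returns a boolean.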
The automatic sorter's clustering is stochastic (KlustaKwik over the global numpy RNG). Under parallel test execution the spike_sort integration tests could occasionally diverge; seeding np.random before each spike_sort call makes them deterministic and reproducible. No production behaviour change.
…ntology_list
PR7 lockstep: ndi_common/ontology/ontology_list.json is now vendored VERBATIM
from the canonical ndi-ontology-matlab package
(Waltham-Data-Science/ndi-ontology-matlab), the single source of truth shared
by the MATLAB and Python consumers. (NDI-matlab has no vendored copy, since it
already consumes that package, so both are now in lockstep.)
- Synced the Python copy to canonical: gains the 'format'->EDAM and
'schema'->SchemaOrg prefix mappings the Python copy lacked, and aligns the
EDAM/IAO/STATO/SchemaOrg homepage/api_url metadata (inert in Python: only
prefix_ontology_mappings is read at lookup time).
- Removed the hardcoded _PREFIX_MAP table in ndi/ontology/__init__.py, which
duplicated (and had drifted from) the JSON. All prefix->ontology mappings now
load from the JSON alone; lookups remain case-insensitive, and every prefix the
hardcoded table carried is present in (and verified against) the JSON.
- Lookups dispatch through the canonical prefixes: 'schema:' / 'format:' reach
the SchemaOrg / EDAM providers (the non-standard 'SchemaOrg:' prefix is gone;
nothing referenced it but a test assertion, now updated).
Fast suite 1928/0; ontology tests 60/60.
…export
PR12's tests use synthetic Phy output; this adds a skip-gated regression test
that runs the importer (labels / waveformdata / meanwaveform / getInfo / probe)
against a REAL Kilosort2.5 export: NeuralEnsemble/ephy_testing_data
phy_example_0 (32ch, ~10s, 788 spikes, 13 clusters).
It exercises the genuine on-disk formats the synthetic fixtures didn't: uint64
spike_times, uint32 spike_clusters/spike_templates, float32 templates, a
cluster_group.tsv whose header is 'KSLabel' (not 'group'), and whitening-matrix
un-whitening. The importer handles all of it unchanged (it casts loaded arrays
to float64 and parses the tsv by position); no source changes were needed.
The real export is NOT vendored; the test reads NDI_KILOSORT_REAL_DIR and skips
when unset (same pattern as the live-cloud tests; fetch instructions in the
module docstring). Validated: 13 clusters -> 13 neurons, getInfo totals 788
spikes, mean waveforms (82,32). Fast suite 1928 passed / 0 failed (these 5 skip
without the data).
The symmetry suite covered util/file/dataset/session but not the
time/syncgraph subsystem. This adds the Python side of a time_convert symmetry
check (the observable contract of the syncgraph):
- tests/symmetry/_time_scenario.py: a self-describing JSON scenario (referent
with two multi-clock epochs) + a battery of time_convert CASES, run through the
real ndi.time.syncgraph.time_convert.
- make_artifacts/time/test_time_convert.py: writes the scenario + computed
out_time/out_epoch/msg to pythonArtifacts/time/.../timeConvertCases.json.
- read_artifacts/time/test_time_convert.py: (1) re-runs and asserts the current
code reproduces the recorded outputs (a cross-run regression guard);
(2) compares against matlabArtifacts/time/... when present.
PARTIAL by design: the cross-language comparison skips until the MATLAB side
generates a matching timeConvertCases.json; full closure needs the MATLAB
runtime (the MATLAB makeArtifacts/+time generator is documented as a TODO in
the scenario + INSTRUCTIONS). The existing session/dataset make_artifacts still
run.
Port of the MATLAB interactive sorter
vlt.neuro.spikesorting.cluster_spikewaves_gui, completing the spike-sorter app:
graphical_mode=1 now launches a real editor instead of raising.
- ndi.app.spikesorter_clustermodel.ClusterModel: the pure-numpy curation core
(no Qt) -- a faithful port of the GUI's data commands: cluster-all
(KlustaKwik/KMeans), reorder-min-to-max, init/rebuild clusterinfo, make-1-to-N,
merge, split, move-to-front, set-quality, set-epochs, the DoneBt finalize
(not-present spikes -> NaN), 2-point/PCA features, point-in-polygon lasso
geometry, and NaN->0 uint16 export. Unclassified spikes are NaN, as in MATLAB.
cluster_all rebuilds clusterinfo (fresh mean shapes) rather than MATLAB's
append-only InitClusterInfo, since the mean shape is the neuron's load-bearing
mean_waveform.
- ndi.app.spikesorter_gui: PyQt/pyqtgraph presentation + event wiring on top of
the model (per-cluster waveform overlays with zoom/pan, feature scatter with
lasso split/add, merge/quality/epoch controls, DONE/Cancel). Qt is imported
lazily; cluster_spikewaves_gui() raises a clear error pointing at the [gui]
extra when PyQt/pyqtgraph are absent.
- spike_sort(graphical_mode=1) prepares the waveforms (shared with the
automatic path) then launches the GUI and writes the curated
clusterids/clusterinfo into the same spike_clusters document the automatic path
produces (NaN->0 in spike_cluster.bin); a cancelled GUI writes nothing.
clusters2neurons consumes the curated result unchanged.
- pyproject: new optional [gui] extra (PyQt6, pyqtgraph, pytest-qt).
- Tests: headless ClusterModel suite (no display), offscreen PyQt GUI suite
(pytest-qt, gated on the deps), graphical-routing + GUI-curated->neurons
end-to-end against a real session. Full fast suite green.
A QDialog rejects (cancels) on Escape by default, but the lasso status text invites the user to press Esc to abort the selection. Override keyPressEvent so Escape only cancels an in-progress lasso when one is active, and otherwise falls through to the default Cancel behaviour.
…GC crash)
_ensure_qapp() created (or fetched) the QApplication but returned it without any
caller holding the reference. PyQt garbage-collects an unreferenced QApplication
and deletes the underlying C++ object; the next QWidget construction then aborts
the process with SIGABRT ("Must construct a QApplication before a QWidget").
pytest-qt holds the app itself, so the suite never hit this -- but a standalone
launch (the interactive sorter) would crash whenever a GC ran mid-session.
Cache the instance in a module global so it lives for the process lifetime.
Tests (offscreen): subprocess regression that drops the _ensure_qapp() return,
forces GC, then builds widgets (would SIGABRT without the fix); plus
merge-all-down-to-one, single-spike window, and the real mouse-driven lasso
handler (_on_scene_clicked) path. Verified against negative/positive controls:
no-reference -> exit 134, reference held -> exit 0.
The waveform overlays were unusable to zoom: every redraw called
wave_layout.clear(), destroying and rebuilding all panels and resetting their
view; each panel zoomed independently; and each held a couple hundred separate
curve items (plus ~40 marker lines), so the scene had thousands of items and
interaction lagged.
- Persist the panels: rebuild the grid only when the cluster COUNT changes, and
update content with PlotItem.clear() (keeps the ViewBox range). User zoom/pan
now survives quality/marker/subset/dim redraws.
- Share one X axis across all panels (setXLink) and constrain the mouse to X
with Y auto-fitting the visible window, so the single zoom gesture acts on the
shared time axis and the panels can never desynchronise (mirrors MATLAB's
shared spike axis).
- Render each panel as ONE NaN-separated polyline (connect='finite') instead of
hundreds of curves -> ~1 item/panel, smooth zoom even with 200 overlaid spikes.
- Draw the cluster mean waveform (the template) as a bold line, plus faint
channel separators, so the spike shape is actually readable.
- Add a "Reset zoom" button; re-fit the feature scatter when the projection
(feature kind / scatter dims) changes.
Tests: panels persist across a same-count redraw and share panel[0]'s X axis
(mouse constrained to X); the cluster-count change rebuilds; Reset zoom re-fits.
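The single-polyline trick comes down to concatenating all waveforms with a NaN after each one, so a renderer that breaks the line at non-finite points (connect='finite' above) draws them as separate traces in one item. A numpy-only sketch, with a hypothetical helper name and no Qt involved:

```python
import numpy as np


def nan_separated_polyline(waveforms, t=None):
    """Flatten (n_spikes, n_samples) waveforms into one NaN-separated x/y pair."""
    waveforms = np.asarray(waveforms, dtype=float)
    n_spikes, n_samples = waveforms.shape
    if t is None:
        t = np.arange(n_samples, dtype=float)
    # Repeat the time axis per spike, with a NaN gap after each repetition.
    x = np.tile(np.append(t, np.nan), n_spikes)
    # Append a NaN column so each waveform ends with a break in the line.
    gaps = np.full((n_spikes, 1), np.nan)
    y = np.concatenate([waveforms, gaps], axis=1).ravel()
    return x, y


x, y = nan_separated_polyline(np.ones((3, 4)))
```

The resulting x and y arrays would be handed to the plotting call once per panel, replacing one curve item per spike.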
UI polish for the interactive sorter:
- optional branded header bar (navy #0b2545 with an accent underline), showing
a logo (logo_path -> SVG rendered to a pixmap, omitted gracefully if the SVG
backend/file is unavailable), the title, and a right-aligned subtitle.
- a light application stylesheet (clean controls, accent-blue DONE button).
- antialias the bold mean-waveform line so the template stays crisp when zoomed.
cluster_spikewaves_gui()/the window gain logo_path + subtitle parameters (both
default off, so existing callers and the 56 tests are unaffected).
…stall)
The editable install (`pip install -e .[dev,tutorials]`, used by ndi_install.py
and CI) failed with ResolutionImpossible: ndi pinned
`vhlab-toolbox-python @ ...@<sha>` while its dependency ndr declares
`vhlab-toolbox-python @ ...@main`, and pip refuses two different direct-URL
references for the same package, even though @main currently resolves to the
exact SHA ndi had pinned.
Align ndi to @main so the references unify and the package installs (then
`python -m ndi check` and the test suite can run). The other git deps (did,
ndr, ndi-compress) stay SHA-pinned. Pinning vhlab-toolbox to a SHA again
requires the lockstep NDR-python change to pin the same SHA first; noted
inline.
Adds Python 3.13 to the support matrix and fixes the issues a fresh install on
the latest dependencies (and on 3.13) surfaced — the CI test jobs were red
because of these, not just the earlier dependency-resolution conflict.
- 3.13 compat: ndi.common.PathConstants used `@classmethod @property` chaining
for NDI_ROOT/COMMON_FOLDER/DOCUMENT_PATH/SCHEMA_PATH. Python 3.13 removed that
chaining, so the bare class-attribute access returned the descriptor instead
of a Path (`ndi_document('base')` -> "unsupported operand type(s) for /:
'method' and 'str'"). Moved the four constants onto a metaclass so
`ndi_common_PathConstants.DOCUMENT_PATH` returns a real Path, lazily cached
and still overridable via set_paths(), on 3.10 through 3.13.
- cryptography is now a hard dependency: it backs the AES-encrypted, 0600
credential store in ndi.cloud.profile. Without it that backend silently fell
back to a non-persistent in-memory store, so a clean install never wrote (and
never chmod'd) the secrets file — breaking the §3.5-1 security guarantee and
its test. Ships 3.13 wheels.
- test_git_deps_are_pinned_not_floating: replace the blunt `"@main" not in text`
substring check (which also tripped on prose/comments) with a parser that
requires every `pkg @ git+...@ref` to be a SHA/tag, allowing the one
documented vhlab-toolbox-python @main exception.
- CI matrix + classifiers: add 3.13.
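The metaclass approach to class-level computed attributes can be sketched as follows. This is an illustration under assumptions, not the ndi.common code: `ExamplePaths`, `_PathsMeta`, the `ROOT` name and the default location are all hypothetical; only the idea (properties defined on the metaclass, lazily cached, overridable through set_paths()) comes from the description above.

```python
from pathlib import Path


class _PathsMeta(type):
    """Metaclass whose properties behave like computed class attributes."""

    @property
    def ROOT(cls) -> Path:
        # Lazily compute and cache the default on first access.
        if cls._root is None:
            cls._root = Path.home() / "example_root"  # hypothetical default
        return cls._root

    @property
    def DOCUMENT_PATH(cls) -> Path:
        return cls.ROOT / "documents"


class ExamplePaths(metaclass=_PathsMeta):
    _root = None

    @classmethod
    def set_paths(cls, root):
        """Override the cached root; derived paths follow automatically."""
        cls._root = Path(root)


ExamplePaths.set_paths("/tmp/example")
schema_file = ExamplePaths.DOCUMENT_PATH / "base.json"
```

Because the property lives on the metaclass, `ExamplePaths.DOCUMENT_PATH` evaluates to a real Path on class access, so the `/` operator works without relying on `@classmethod @property` chaining.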
Verified on a clean Python 3.13.14 venv (uv) and a clean 3.12 venv: editable
install resolves, `python -m ndi check` = 14/14, and the CI-relevant suite is
1943 passed / 86 skipped / 0 failed on both.
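A parser-based pin check along the lines described for test_git_deps_are_pinned_not_floating might look like the sketch below. The regular expressions, the `floating_git_deps` name, the SHA/tag definitions and the example URLs are assumptions made for illustration; the real test may differ.

```python
import re

# Matches "name @ git+<url>@<ref>" and captures the package name and the ref.
_GIT_DEP = re.compile(
    r"(?P<name>[A-Za-z0-9_.-]+)\s*@\s*git\+\S+?@(?P<ref>[A-Za-z0-9_./-]+)"
)
_SHA = re.compile(r"[0-9a-f]{7,40}")      # assumed definition of a SHA ref
_TAG = re.compile(r"v?\d+(\.\d+)*")       # assumed definition of a tag ref


def floating_git_deps(text, allowed=frozenset()):
    """Return (name, ref) pairs whose ref is neither a SHA nor a tag."""
    floating = []
    for match in _GIT_DEP.finditer(text):
        name, ref = match["name"], match["ref"]
        pinned = _SHA.fullmatch(ref) or _TAG.fullmatch(ref)
        if not pinned and (name, ref) not in allowed:
            floating.append((name, ref))
    return floating


sample = '''
# prose that mentions @main must not trip the check
"did @ git+https://example.invalid/did.git@0123456789abcdef0123456789abcdef01234567",
"vhlab-toolbox-python @ git+https://example.invalid/vhlab.git@main",
"ndr @ git+https://example.invalid/ndr.git@develop",
'''
exceptions = frozenset({("vhlab-toolbox-python", "main")})
result = floating_git_deps(sample, allowed=exceptions)
```

Only direct-URL requirements are inspected, so a comment containing "@main" is ignored, and the one documented exception is passed in explicitly rather than hard-coded.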
Resolve document superclasses class_name-first with the legacy definition
path as a UNION fallback, so schema-v2 (V_delta/V_epsilon) documents whose
superclasses are bare {class_name} objects resolve correctly while the v1
definition-only corpus (127/127 entries today) behaves exactly as before.
- doc_superclass: read class_name, then UNION the definition-derived name
(never short-circuit); defensive bare-string / non-dict handling.
- read_blank_definition: resolve each superclass type class_name-first, also
following a differing definition path (UNION), which restores field
inheritance for class_name-shaped superclasses.
Mirrors the merged ndi-cloud-node reference contract
(api/src/dal/class_lineage.ts computeClassLineage). Adds
tests/test_schema_v2_dual_accessor.py including the cross-stack union
conformance pin. Spine-1 / Leg A of the schema-v2 platform work.
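The class_name-first union can be illustrated with a small stand-alone function. This is a sketch, not the doc_superclass implementation: `superclass_names` is a hypothetical name, and it assumes a v1 definition path ends in `<class>.json` so that the class name is the file stem.

```python
from pathlib import PurePosixPath


def superclass_names(superclasses):
    """Collect superclass names, class_name first, then the definition name.

    Both sources are always consulted (a union, never a short-circuit).
    """
    if isinstance(superclasses, (str, dict)):
        superclasses = [superclasses]      # tolerate a bare entry
    names = []
    for entry in superclasses or []:
        candidates = []
        if isinstance(entry, str):
            candidates.append(entry)       # defensive bare-string handling
        elif isinstance(entry, dict):
            if entry.get("class_name"):
                candidates.append(entry["class_name"])
            if entry.get("definition"):
                # Assumption: the definition path ends in "<class>.json".
                candidates.append(PurePosixPath(entry["definition"]).stem)
        for name in candidates:
            if name and name not in names:
                names.append(name)
    return names


v1 = [{"definition": "$NDIDOCUMENTPATH/base.json"}]
v2 = [{"class_name": "base"}]
mixed = [{"class_name": "scalar_mass", "definition": "$NDIDOCUMENTPATH/base.json"}]
```

Here `v1` and `v2` both resolve to `["base"]`, while `mixed` keeps both names because the definition-derived name differs from class_name.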
Complete the oridirtuning residuals so calculate_oridir_indexes emits a fully MATLAB-faithful orientation_direction_tuning document.
- _oridir_fitindexes: real Carandini/Ferster double-gaussian fit, delegating to vlt's otfit_carandini + fit2fit* helpers. The leaf functions are imported directly to dodge a name-shadowing bug in the vlt package __init__ (its own oridir_fitindexes wrapper is unusable). Falls back to NaN/empty sentinels when vlt's fit surface is absent.
- calculate_tuning_curve: wired to the now-ported ndi.app.stimulus.tuning_response.tuning_curve (tune over angle/direction on sFrequency-bearing stimuli), mirroring the MATLAB; unblocks calculate_all_tuning_curves + struct2doc.
- calculate_oridir_indexes: stores the real fit sub-structure.
- Refresh stale BLOCKER docstrings (oridirtuning module, plot_oridir_response, tuning_response.compute_stimulus_response_scalar/control_stimulus_ids).
Synthetic-recovery verified (recovers known double-gaussian params to ~0.1%); exact MATLAB numerical parity (scipy Nelder-Mead vs fminsearch) flagged for MATLAB cross-validation. Tests updated + a no-vlt sentinel-fallback test added.
…ate_oridir_indexes
calculate_oridir_indexes looped over len(individual_responses_real), assuming that array was indexed by direction. Real MATLAB stimulus_tuningcurve documents serialise the individual-response matrices as [repetition, stimulus] (e.g. a Carbon Fiber doc is 5x12 = 5 reps x 12 directions), so the loop produced one value per REPETITION and then `np.vstack([directions, response_mean, ...])` raised "size 12 vs 5": every real tuning curve crashed the compute path.
Derive the direction count from independent_variable_value and orient each individual-response matrix so its direction axis is first (transpose the reps-first MATLAB shape; tolerate already-correct and 1-D shapes), then reduce over repetitions per direction. The directions-first synthetic-fixture path is unchanged.
Verified on 20 real Carbon Fiber stimulus_tuningcurve docs (was 100% crash, now 20/20 produce a 12-direction orientation_direction_tuning doc with vector indices + double-gaussian fit). Adds a reps-first regression test that exercises the MATLAB serialisation order. Full fast suite: 1583 passed.
…proach
ndi.fun.doc_table.epoch() now mirrors MATLAB ndi.fun.docTable.epoch:
- add local_t0/local_t1/global_t0/global_t1 from the ingested DAQ-reader epoch table (daqreader_mfdaq_epochdata_ingested.epochtable; t0_t1[0]=t0s, t0_t1[1]=t1s, columns ordered by epochclock; global = datenum->datetime, truncated to the second to match MATLAB display)
- MixtureName/MixtureOntology now parse stimulus_bath.mixture_table (CSV with name/ontologyName columns, unique-stable, comma-joined) instead of the location name
- ApproachName/ApproachOntology aggregated from openminds_stimulus.fields
- one row per (probe, epoch), grouped probe-major, per-probe EpochNumber
- drop columns empty across every row, matching MATLAB
- correct column order: ...,SubjectDocumentIdentifier,local_t0,local_t1,global_t0,global_t1,MixtureName,...
Verified against the dabrowska tutorial's recorded epochSummary: all 36 displayed epochs match on timing/mixture/approach (0 mismatches); the full tests/matlab_tests/test_dabrowska.py epoch + combinedSummary join + approach filter suites pass.
…t reconstruction
Three fixes that let a downloaded (ingested) dataset's probe build its epoch
table and read timeseries in seconds instead of ~11 minutes:
- daq/system.py epochtable(): the per-epoch line used
entry.get('epoch_id', self.epochid(n)) — dict.get evaluates its default
eagerly, so self.epochid() (which hashes the epoch's files, ~0.4s) ran for
EVERY epoch even though epoch_id is always present. ~670s for a 1604-epoch
dataset. Make the fallback lazy → only call epochid() when epoch_id is
actually missing.
- epoch/epochset.py _compute_hash(): make_hashable deep-recursed the to_dict
of heavy epochset/session back-references embedded in the epoch table (the
DAQ system stored in underlying_epochs['underlying']) — O(N^2)+ and useless
for cache validation. Collapse such objects to a cheap stable identity token.
- database_fun.py ndi_document2ndi_object(): reconstruct the CONCRETE element
subclass declared by element.ndi_element_class (e.g. ndi.probe.timeseries
.mfdaq) via the class registry, preserving the document linkage, so the
object keeps readtimeseries instead of degrading to a bare ndi_element (G1).
Verified: the full unit suite (epoch/probe/daq/element) stays green; a
patch-Vm probe now builds its epoch table and reaches the ingested binary
reader on the dabrowska dataset.
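The eager-default pitfall behind the epochtable() fix is easy to reproduce in isolation. In this sketch `expensive_epochid` is a stand-in for the file-hashing self.epochid() call; the names are hypothetical.

```python
calls = []


def expensive_epochid(n):
    """Stand-in for a costly id computation (e.g. hashing an epoch's files)."""
    calls.append(n)
    return f"epoch_{n}"


entry = {"epoch_id": "abc123"}

# Eager: the default argument is evaluated before dict.get is even called,
# so the expensive function runs although the key is present.
eager = entry.get("epoch_id", expensive_epochid(1))

# Lazy: the fallback is only evaluated when the key is actually missing.
lazy = entry["epoch_id"] if "epoch_id" in entry else expensive_epochid(2)
```

After both lookups, `calls` contains only the eager invocation, which is the per-epoch cost the fix removes.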
The ingested daqreader epochtable stores t0_t1 in two different layouts depending on who wrote it: NDI-python writes per-clock [t0, t1] pairs (Nclocks rows), while MATLAB ingestion (which produced all the published cloud datasets) writes [t0-row, t1-row] (2 rows: start times then end times, Nclocks columns). t0_t1_ingested assumed the NDI-python layout, so on a MATLAB-ingested dataset it read the global clock's datenum start time (~739053) as the local end time. Via epochtimes2samples_ingested that made readtimeseries(epoch, -inf, inf) request ~7 billion samples → thousands of out-of-bounds .nbf segments → 'binary not found' for every segment.
Normalise both layouts to per-clock (t0, t1): unambiguous by shape when Nclocks != 2; for the 2x2 case disambiguate with the global clock (its values are datenums > 1e5, which a per-epoch local time never is): datenums filling a column => [t0-row, t1-row] (transpose), filling a row => already per-clock.
Verified: patch-Vm/patch-I readtimeseries on the dabrowska dataset now returns correct traces (Vm in mV, a clean -100..+150 pA / 10 pA current-step family); the NDI-python-ingested path + all daq/time unit tests stay green.
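The layout normalisation can be sketched with plain lists. This is an illustration of the rule described above, not the shared daq-reader helper: `normalize_t0_t1` is a hypothetical name, and leaving an undecidable 2x2 matrix unchanged is an assumption of this sketch.

```python
def normalize_t0_t1(t0_t1, datenum_threshold=1e5):
    """Return a list of per-clock (t0, t1) pairs from either stored layout."""
    rows = [[float(v) for v in row] for row in t0_t1]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0

    if n_cols == 2 and n_rows != 2:
        # Nclocks x 2: already one (t0, t1) pair per clock.
        return [tuple(row) for row in rows]
    if n_rows == 2 and n_cols != 2:
        # 2 x Nclocks: a row of start times, then a row of end times.
        return list(zip(rows[0], rows[1]))
    if n_rows == 2 and n_cols == 2:
        big = [[v > datenum_threshold for v in row] for row in rows]
        column_filled = any(big[0][c] and big[1][c] for c in range(2))
        row_filled = any(big[r][0] and big[r][1] for r in range(2))
        if column_filled and not row_filled:
            # Datenums fill a column: [t0-row, t1-row], so transpose.
            return list(zip(rows[0], rows[1]))
        # Datenums fill a row (or nothing decides): treat as per-clock.
        return [tuple(row) for row in rows]
    raise ValueError(f"unexpected t0_t1 shape: {n_rows}x{n_cols}")


matlab_layout = [[0.0, 739053.5], [10.0, 739053.6]]   # t0 row, then t1 row
python_layout = [[0.0, 10.0], [739053.5, 739053.6]]   # one pair per clock
```

Both example inputs describe the same epoch (local 0 to 10 s plus a global datenum clock) and normalise to the same pair list.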
…ce parity NDI-matlab exposes this under +ndi/+fun/+doc/ (tutorials call it as ndi.fun.doc.ontologyTableRowDoc2Table), but the Python implementation lives in ndi.fun.doc_table — so ndi.fun.doc.ontologyTableRowDoc2Table raised AttributeError, breaking the jess-haley tutorial's table-building step. Re-export it (and a snake_case alias) from ndi.fun.doc; no import cycle since doc_table does not import doc.
Unblocks readtimeseries on element-timeseries elements (e.g. jess-haley
position/distance tracking) for a MATLAB-ingested cloud dataset:
- element.buildepochtable parsed element_epoch.epoch_clock and t0_t1 assuming
Python's list form, but MATLAB writes epoch_clock as a comma-joined string
('dev_local_time,exp_global_time') and t0_t1 as a string of a [t0-row,
t1-row] matrix. Iterating the clock string fed one char at a time to
ndi_time_clocktype ('d' is not a valid clocktype); the matrix was read with
the wrong axis. Parse the string forms and normalise t0_t1 to per-clock
(t0,t1) via the shared daq-reader layout helper.
- element_timeseries._read_from_ingested used database_existbinarydoc (a
LOCAL-only check) and bailed when the VHSB was a cloud (ndic://) reference.
Fall back to database_openbinarydoc, which fetches on demand, then re-check.
Verified on jess-haley: position → (N,2) X/Y pixel tracks over a 1-hr video,
distance → (N,3) distance-to-patch. Full element/timeseries/epoch/daq/vhsb +
dabrowska & jess-haley parity suites stay green (465 passed).
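The clock-string failure mode is simple to demonstrate. In this sketch `parse_epoch_clock` is a hypothetical helper that accepts either the Python list form or the MATLAB comma-joined string; it does not attempt the t0_t1 matrix-string parsing, whose exact format is not shown here.

```python
def parse_epoch_clock(value):
    """Return clock type names from a list or a comma-joined string."""
    if isinstance(value, str):
        return [name.strip() for name in value.split(",") if name.strip()]
    return [str(name) for name in value]


matlab_form = "dev_local_time,exp_global_time"
python_form = ["dev_local_time", "exp_global_time"]

# Iterating the raw string yields single characters, which is how a bare
# 'd' ended up being passed on as a clock type.
first_char = list(matlab_form)[0]

clocks_from_matlab = parse_epoch_clock(matlab_form)
clocks_from_python = parse_epoch_clock(python_form)
```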
Two OLSProvider bugs surfaced by the jess-haley imageStack format lookup (ndi.ontology.lookup of NCIT format codes like NCIT:C190180 / C85437):
- lookup_term only treated all-digit terms as ids; an NCIT code 'C190180' (letter + digits) fell through to a *label* search for the literal string 'C190180', which matches nothing. Route <letters><digits> codes to an exact obo_id search, falling back to label search only if empty.
- the OLS4 /search call omitted the synonym field, so _doc_to_result always saw an empty synonym list, but callers use synonyms (the imageStack format is lower(synonyms[0])). Request fieldList incl. synonym/description.
Now NCIT:C85437 -> 'Portable Network Graphics' / synonyms ['PNG', ...] -> png; C190180 -> 'MP4 Format' / ['MP4 File Format','MP4',...]. 66 ontology tests pass.
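The term-routing rule from the first bullet can be sketched as a small classifier. `classify_term` and its return values are hypothetical names for illustration; the actual lookup_term also performs the network search and the label-search fallback, which are not reproduced here.

```python
import re

# One or more letters followed by one or more digits, e.g. "C190180".
_CODE = re.compile(r"[A-Za-z]+\d+")


def classify_term(term):
    """Decide whether a term should go to an id search or a label search."""
    term = term.strip()
    if term.isdigit() or _CODE.fullmatch(term):
        return "id"
    return "label"


routes = {t: classify_term(t) for t in ["C190180", "0000001", "MP4 Format"]}
```

Under this rule an all-digit term and a letters-plus-digits code both take the id route, while free text such as 'MP4 Format' still goes to the label search.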
Port of ndi.database.fun.readtablechar: parse a delimited-text char array (the mixture/odor/treatment tables NDI stores inside documents) into a pandas DataFrame. Maps the readtable name-value options NDI uses (Delimiter -> sep, ReadVariableNames -> header) onto pandas.read_csv; accepts options MATLAB-style (..., 'Delimiter', ',') or as kwargs. Needed by the C. elegans-memory tutorial's odorTable + treatment mixtureTable steps.
…when samplerate(epoch) is unavailable extract_epoch_inmemory hard-required element.samplerate(epoch), which cloud-materialized elements return as None (per-epoch rate metadata is not populated in the materialized DID/sqlite store) even though readtimeseries and its time vector read fine — so spike extraction failed on otherwise decodable raw traces. Read the window first, then resolve the rate: prefer the per-epoch accessor (MATLAB-parity path, unchanged), else derive fs = (N-1)/(t_last - t_first) from the returned time vector. Raises the same ValueError only when neither yields a positive rate. +2 tests.
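The rate-resolution order can be written down directly from the description. This is a sketch with a hypothetical `resolve_samplerate` helper: it prefers the declared per-epoch rate and otherwise derives fs = (N-1)/(t_last - t_first) from the time vector.

```python
def resolve_samplerate(declared_rate, times):
    """Prefer the declared rate; otherwise derive it from the time vector."""
    if declared_rate is not None and declared_rate > 0:
        return float(declared_rate)
    if len(times) >= 2:
        span = times[-1] - times[0]
        if span > 0:
            # N samples span N-1 intervals.
            return (len(times) - 1) / span
    raise ValueError("no positive sample rate available")


times = [0.0, 0.5, 1.0, 1.5, 2.0]
derived = resolve_samplerate(None, times)       # rate metadata missing
declared = resolve_samplerate(30000, times)     # per-epoch accessor works
```

With five samples spanning two seconds the derived rate is 2.0 Hz, and a declared rate always wins when it is available.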
…orientation, download zip-bomb guard
Adversarial-review remediation:
- spikeextractor: replace banker's rounding at the spike-center index with MATLAB half-away-from-zero (math.floor(N/2+0.5)-1); odd-N windows (the default N=45) reported every spike one sample early. No-op for even N.
- oridirtuning: select the directions axis explicitly instead of a shape heuristic that mis-oriented a square reps==directions response matrix.
- cloud/download: cap uncompressed size + entry count in the bulk-download ZIP extractor (zip-bomb guard, env-configurable; ZipBombError<-ValueError).
+ regression tests for all three.
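The rounding difference is easy to check for the default window. This sketch assumes the earlier code used Python's built-in round(), which rounds halves to the nearest even integer; the function names are hypothetical.

```python
import math


def center_index_bankers(n):
    """Zero-based center index using Python's round (half to even)."""
    return round(n / 2) - 1


def center_index_matlab(n):
    """Zero-based center index using half-away-from-zero rounding (n > 0)."""
    return math.floor(n / 2 + 0.5) - 1


default_old = center_index_bankers(45)   # round(22.5) is 22, so index 21
default_new = center_index_matlab(45)    # floor(23.0) is 23, so index 22
even_old = center_index_bankers(44)
even_new = center_index_matlab(44)
```

For N=45 the old index sits one sample before the new one, and for an even window such as N=44 the two agree.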
… (diamond) V_epsilon's observation tier is the first DIAMOND hierarchy (body_weight_observation <- scalar_observation AND scalar_mass, both reaching base). Add two conformance pins: (1) the reader resolves isa() for every ancestor of a FLATTENED-diamond document via either parent path, de-duplicated; (2) read_blank_definition flattens a synthetic V_epsilon diamond loop-free (shared ancestor merged once via the _DEFINITION_CACHE memoization). Locks the v1+V_epsilon read contract; no runtime change. 15 passed, ruff clean.
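A loop-free, de-duplicated ancestor walk over a diamond can be sketched as below. The parent table is a toy model of the hierarchy named above (it assumes both parents point directly at base), and `ancestors` is a hypothetical helper rather than the reader's implementation.

```python
PARENTS = {
    "base": [],
    "scalar_observation": ["base"],
    "scalar_mass": ["base"],
    "body_weight_observation": ["scalar_observation", "scalar_mass"],
}


def ancestors(name, parents):
    """Return every ancestor exactly once, in depth-first discovery order."""
    seen = set()
    order = []

    def visit(current):
        for parent in parents[current]:
            if parent not in seen:      # shared ancestors are merged once
                seen.add(parent)
                order.append(parent)
                visit(parent)

    visit(name)
    return order


lineage = ancestors("body_weight_observation", PARENTS)
```

The shared ancestor `base` is reachable through either parent path but appears only once in the result.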
…n, not unported stub
load_presentation_time IS ported; the irregular/no-control case is a real
data limitation, not a missing port. Replace the stale NotImplementedError
('unported stub') with a ValueError carrying an accurate message + update the
docstring Raises: section. Runtime-neutral; 46 tuning_response tests pass.
Single consolidated review PR for all the 2026-06 NDI-python work. Supersedes #57 (fix/app-residuals), which is fully contained in this branch: this branch = #57 + the 4 schema-v2 dual-accessor commits on top. integration/ndi-python-all is also fully contained. Please review here and close #57.
66 commits · 232 files · +21,589 / −2,524 vs main.
What's in it
doc_superclass + read_blank_definition; the class_name-first union the V_epsilon schema needs; a V_epsilon multiple-inheritance (diamond) parity test; MATLAB-parity spike-center rounding; download zip-bomb guard.
Verification
pytest -n auto, excluding tests/symmetry + tests/matlab_tests which need a MATLAB runtime, and -m "not slow").
Notes
class_name-in-superclass branch is forward-compatible (handles both v1 {definition:...} and v2 {class_name:...} superclass shapes).